I have a dataset with something like 65 million rows in it, in a tab-delimited .txt file, with some extraneous quotation marks. I have been able to import a related dataset with 51 million rows and 2 columns without any issue.
But even when I suppress quote binding, Stata seems to be crapping out on importing it. It's supposed to be able to handle 20 billion observations. There's only like 50 columns too.
Now, it is eating up a ton of my memory while it imports. But what I'm wondering is: do I just need to leave it overnight and let it burn? Or do I need to fix a setting? Because doing the 51 million rows with 2 variables only took like 30 seconds to import.
It just doesn't make sense to me that a 2*51,000,000 matrix would take 30 seconds but a 50*51,000,000 would take >2 hours, since if processing time is stable by matrix size (which idk if it is) it should be about 10-15 minutes import time.
In terms of file size, the one I handled successfully was 897 MB of .txt, whereas the troublesome one is 15.4 GB.
For the curious, I have 16 GB installed RAM.
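To make the back-of-envelope estimate above concrete, here is a minimal sketch that scales the reported 30-second import two ways: by file size and by cell count. It assumes import time grows linearly with the amount of data, which is exactly the assumption in question, and it uses the 65 million row figure from the start of the thread for the large file. The constant and function names are made up for illustration.

```python
# Back-of-envelope scaling of the 30-second import.
# Assumption (not a measurement): import time is linear in the data volume.

BASELINE_SECONDS = 30          # reported time for the 51M-row, 2-column file
BASELINE_MB = 897              # reported size of that file
BIG_FILE_MB = 15.4 * 1000      # 15.4 GB, treating 1 GB as 1000 MB


def scaled_minutes(baseline_seconds: float, ratio: float) -> float:
    """Scale a baseline time by a size ratio and convert to minutes."""
    return baseline_seconds * ratio / 60


# Ratio of file sizes on disk.
by_size = scaled_minutes(BASELINE_SECONDS, BIG_FILE_MB / BASELINE_MB)

# Ratio of cell counts: 65M rows x 50 columns versus 51M rows x 2 columns.
by_cells = scaled_minutes(
    BASELINE_SECONDS, (65_000_000 * 50) / (51_000_000 * 2)
)

print(f"scaled by file size:  {by_size:.1f} minutes")
print(f"scaled by cell count: {by_cells:.1f} minutes")
```

Both estimates come out far below two hours, so simple linear scaling does not account for the slowdown. With a 15.4 GB file and 16 GB of RAM, memory pressure is one plausible suspect, though nothing here confirms it.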
So after it runs the import for about 3 minutes this is what happens, EVEN IF I tell it to only import *the first column*, which is numeric. [Image]
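One way to check whether the stray quotation marks or ragged rows are part of the problem, without loading the whole file into memory, is to stream the first chunk of it and count fields and quotes per line. This is only a sketch outside Stata: the function name `profile_head`, the path in the usage comment, and the UTF-8 encoding are all assumptions, not details from the thread.

```python
from collections import Counter
from itertools import islice


def profile_head(path: str, n_lines: int = 100_000, encoding: str = "utf-8") -> dict:
    """Stream the first n_lines of a tab-delimited file and summarise them.

    Reads one line at a time, so memory use stays small regardless of file size.
    """
    field_counts = Counter()     # number of tab-separated fields -> line count
    odd_quote_lines = 0          # lines with an unbalanced number of '"'
    lines_read = 0
    with open(path, "r", encoding=encoding, errors="replace", newline="") as fh:
        for line in islice(fh, n_lines):
            lines_read += 1
            row = line.rstrip("\r\n")
            field_counts[row.count("\t") + 1] += 1
            if row.count('"') % 2 == 1:
                odd_quote_lines += 1
    return {
        "lines_read": lines_read,
        "field_counts": dict(field_counts),
        "odd_quote_lines": odd_quote_lines,
    }


# Hypothetical usage (the path is a placeholder):
# print(profile_head("big_file.txt"))
```

If `field_counts` shows more than one distinct width, or `odd_quote_lines` is nonzero, the file probably needs cleaning before any importer will read it predictably.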
So, my question is: should I just let it run and it will eventually pound through and get the job done? Or is this a situation where I need to actually do something differently?
Also, the real crisis here: it uses up enough memory that my Imperator savegame as Cappadocia doesn't run smoothly, and if I don't knock off Mithridates, he'll soon become a real threat.

Thread by Lyman Stone 石來民 🦬🦬🦬 (@lymanstoneky)