There are a lot of changes & new features coming to #rstats #tidycensus in April; you can try out the new features now with `remotes::install_github("walkerke/tidycensus")`. Read on in this thread to get up to speed on the updates: github.com/walkerke/tidyc…
`get_acs()` and `get_pums()` now both default to the brand-new 5-year American Community Survey estimates. If you need a different year, set it explicitly with the `year` argument.
However, the 2020 1-year ACS experimental estimates _are not_ available in tidycensus, and requesting them will throw an error. If you need 1-year ACS data, explicitly request a different year with the `year` argument.
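Here's a minimal sketch of pinning the year and survey explicitly instead of relying on the defaults. The helper name `county_income()` and the choice of variable (B19013_001, median household income) are my own for illustration, and the sketch assumes a Census API key has already been set with `census_api_key()`.

```r
library(tidycensus)

# Hypothetical helper: median household income (B19013_001) by county
# for one state. Assumes a Census API key is already configured.
county_income <- function(state, year, survey = "acs5") {
  get_acs(
    geography = "county",
    variables = "B19013_001",
    state = state,
    year = year,      # set explicitly rather than relying on the default
    survey = survey   # "acs5" for 5-year data, "acs1" for 1-year data
  )
}

# Example calls (each one queries the Census API):
# county_income("TX", year = 2019)                   # 2015-2019 5-year ACS
# county_income("TX", year = 2019, survey = "acs1")  # 2019 1-year ACS
```

Keep in mind that 1-year ACS data only cover geographies with populations of 65,000 or more, so the `acs1` call returns fewer counties than the 5-year call.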
Analysts using the 5-year ACS detailed tables are commonly confused when variables come back as NA for a given geography. A new geography column in `load_variables()` output tells you the smallest geography at which a given variable is available!
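A quick sketch of how you might inspect that column. The helper names below are hypothetical, and the sketch assumes the 2020 5-year detailed tables (`"acs5"`), where `load_variables()` returns `name`, `label`, `concept`, and `geography` columns.

```r
library(tidycensus)
library(dplyr)

# Hypothetical helper: tabulate how many detailed-table variables are
# available down to each geography level.
variables_by_geography <- function(year = 2020) {
  load_variables(year, "acs5") %>%
    count(geography, name = "n_variables")
}

# Hypothetical helper: look up the smallest available geography
# for a single variable code.
smallest_geography <- function(variable, year = 2020) {
  load_variables(year, "acs5") %>%
    filter(name == variable) %>%
    select(name, concept, geography)
}

# Example calls (each one downloads the variable list from the Census API):
# variables_by_geography()
# smallest_geography("B19013_001")
```

If a variable comes back as NA for block groups, checking its `geography` value first tells you whether the data simply aren't published at that level.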
`get_acs()` also now supports the ACS Comparison Profile, which is great for time-series analysis and making appropriate (e.g. inflation-adjusted) comparisons. Look up variable codes for 2016-2020 with `load_variables(2020, "acs5/cprofile")`
I've tweeted this before, but it's worth repeating: the new `as_dot_density()` function makes data prep for dot-density maps a breeze. Try modifying the example in the docs (?tidycensus::as_dot_density) for a location of your choice, and try out the dasymetric option too!
I'm in the process of integrating all of these updates into my book "Analyzing US Census Data" (walker-data.com/census-r/) so keep an eye out there for full documentation. These updates should be on CRAN sometime in April.
According to the Census Bureau, all files on the public FTP server have been made unavailable to the public to comply with the President's executive orders.
They say they'll work to restore the files after they are reviewed and approved.
These inaccessible files include Census / ACS flat files, TIGER/Line shapefiles, and much of the data documentation / guidebooks on the website.
However: you CAN still access data via API as well as on data.census.gov
For demographic data, the tidycensus & censusapi #rstats packages, and the censusdis #Python package, offer convenient interfaces to the Census API.
Isochrones show you the reachable area from a location for a given travel mode.
In #rstats, creating isochrones with @Mapbox web services is easier than you think.
A 🧵 on getting started with isochrones using Mapbox and R:
In the mapboxapi R package, the function `mb_isochrone()` helps you calculate isochrones with some extra features to make your life easier.
For example, `mb_isochrone()` is integrated with Mapbox's geocoder so you can create isochrones directly from addresses
Alternatively, if you have an sf POINT object you can calculate isochrones in bulk over each location. Use the `id_column` argument to associate your isochrones with your input points
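A rough sketch of both workflows. The wrapper names, travel profiles, and times are my own choices for illustration, and the sketch assumes a Mapbox access token has already been stored with `mb_access_token()`.

```r
library(mapboxapi)

# Hypothetical wrapper: isochrones straight from an address string.
# mb_isochrone() geocodes the address before computing the polygons.
address_isochrones <- function(address, minutes = c(5, 10, 15)) {
  mb_isochrone(address, profile = "driving", time = minutes)
}

# Hypothetical wrapper: one isochrone per row of an sf POINT object.
# id_col names the column in points_sf that identifies each input point.
points_isochrones <- function(points_sf, id_col, minutes = 10) {
  mb_isochrone(
    points_sf,
    profile = "walking",
    time = minutes,
    id_column = id_col
  )
}

# Example calls (each one queries Mapbox web services):
# address_isochrones("2850 S University Dr, Fort Worth, TX 76129")
# points_isochrones(my_points, id_col = "site_id")  # my_points is hypothetical
```

Both wrappers return sf polygons by default, so the results can go directly into your mapping package of choice.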
The latest release of Quarto integrates "lightbox" functionality for images with the option `lightbox: true`. Click to highlight your image for the audience:
The option `code-line-numbers` allows you to incrementally step through lines of code, which is excellent for programming walkthroughs.
The syntax `#| code-line-numbers: "|4|5"` allows me to step through tidycensus options incrementally as I present them!
You already know that #tidycensus gets you pre-cleaned US Census data ready to analyze and map in #rstats.
But did you know it includes other features to make your data science projects easier?
Let's take a tour of a few features in this 🧵:
Dot-density maps are a staple in ArcGIS, but have traditionally been slow to produce in R
Use the function `as_dot_density()` to speed up this process, and for the US use the argument `erase_water` to avoid placing dots in water areas!
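Here's a sketch adapted from the pattern in the function's documentation, with `erase_water = TRUE` switched on. The wrapper `race_dots()` and the 100-people-per-dot default are assumptions for illustration; it also assumes a Census API key is set.

```r
library(tidycensus)

# 2020 decennial Census (PL 94-171) race / ethnicity variables
race_vars <- c(
  Hispanic = "P2_002N",
  White = "P2_005N",
  Black = "P2_006N",
  Asian = "P2_008N"
)

# Hypothetical wrapper: tract-level counts with geometry, converted to dots.
race_dots <- function(state, county, people_per_dot = 100) {
  tract_race <- get_decennial(
    geography = "tract",
    variables = race_vars,
    state = state,
    county = county,
    geometry = TRUE,
    year = 2020,
    sumfile = "pl"
  )

  as_dot_density(
    tract_race,
    value = "value",                 # column holding the counts
    values_per_dot = people_per_dot, # each dot represents this many people
    group = "variable",              # one set of dots per race / ethnicity group
    erase_water = TRUE               # remove water areas before placing dots
  )
}

# Example call (queries the Census API and downloads water-area data):
# baltimore_dots <- race_dots("MD", "Baltimore city")
```

The result is an sf object of points, so you can map it with `ggplot2::geom_sf()` and color the dots by `variable`.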
A common task with US Census data is to compare data over time, but the Census Bureau re-draws small areas every 10 years
The `interpolate_pw()` function helps you with fast population-weighted areal interpolation, which is typically more accurate than using area weighting
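To see why the weighting matters, here's a toy base-R calculation. All of the numbers are made up, and this is only an arithmetic illustration of the idea, not how `interpolate_pw()` is implemented internally.

```r
# One source tract with 1,000 people is split between two target zones, A and B.
source_pop <- 1000

# Share of the tract's land area that falls in each target zone
area_share <- c(A = 0.5, B = 0.5)

# Populations of hypothetical blocks inside the tract, and the zone each falls in
block_pop  <- c(400, 350, 150, 100)
block_zone <- c("A", "A", "A", "B")

# Area weighting: allocate people in proportion to overlapping area
area_weighted <- source_pop * area_share

# Population weighting: allocate people in proportion to where the
# block-level population actually sits
pop_share    <- tapply(block_pop, block_zone, sum) / sum(block_pop)
pop_weighted <- source_pop * pop_share

area_weighted  # A = 500, B = 500
pop_weighted   # A = 900, B = 100
```

Area weighting assumes people are spread evenly across the tract, so it moves 400 people into zone B who actually live in zone A. In `interpolate_pw()`, the `weights` argument (typically Census blocks with a population column passed as `weight_column`) plays the role of `block_pop` here.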