Using Neptyne to explore the population growth of the US States

3 min readDec 11, 2022

As the United States continues to grow and develop, understanding the population changes of its individual states can provide valuable insights into the country’s overall progress and demographics. In this post, we will show we can use Neptyne, the programmable spreadsheet to import the population data, refine that data and visualize the growth in an appealing way in just a few lines of Python while taking advantage of the spreadsheet that Neptyne is.

Let’s start by importing the data:

def merge_rows(*tbls):
    res = defaultdict(dict)
    years = set()
    for tbl in tbls:
        for _, row in tbl.iterrows():
            for year, v in row.items():
                    years.add(year)
                    if state == "United States":
                        continue
                    res[row['Name']][year] = int(v)
    return years, {state: [pops.get(year) for year in years] for state, pops in res.items()}

def load_data():
    tbls = pd.read_html(
        "https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_historical_population"
    )
    years, states = merge_rows(tbls[0], tbls[2], tbls[3])
    Populations!B1 = [["Population", *years], 
                      *[[state, *population] for state, population in states.items()]]

The load_data() function in the code imports population data for each US state from a Wikipedia page. It then uses the merge_rows() function to process the data and create a table containing the population data for each state. This table is stored in a sheet called “Populations” in the spreadsheet.

The merge_rows() function takes in one or more tables of data, each represented as a Pandas DataFrame, and extracts the relevant data from each row, merging rows of the same state. it returns the years and the population table.

To create a cartogram using a spreadsheet, we can use the grid structure of the spreadsheet to represent the states. We start by assigning each state to a cell in the spreadsheet by entering its state code in that cell.

To display the state population for a certain year, we can fetch the population data from the Populations sheet and set the background color of each cell to correspond with the population of that state. This allows us to visualize the state populations using the grid structure of the spreadsheet.

Matplotlib has a nice way to pick colors on a scale:

cmap = LinearSegmentedColormap.from_list(‘rg’,[“r”, “y”, “b”], N=256)

Which creates a color scale from red through yellow to blue with 256 steps. Setting up the background color of the cell then becomes:

rel_pop = int(256 * (results[code] — lmn) / (lmx — lmn))
r, g, b, _ = [int(x * 255) for x in cmap(rel_pop)]
state.set_background_color(r, g, b)

Anything we can learn? The map from 1790 above is already quite interesting. The populations are closer to each other than today and New York is not particularly dominant. A hundred years later things look different!

It’s just after the Civil War and we’re hitting peak industrialization and you can see it in the population patterns. Indiana has more people than Texas, Florida and California combined. The overall population has exploded from 3.8 million to ten times that.

Jump another 150 years to our current time and we can see a dramatic shift to the south and to the coast. California, Texas and Florida are the heavy hitters now and especially the states in the middle, up from Wyoming all the way down to Mississippi have grown a lot less than most.

If you want to play with the data or the visualization, hop on over to Neptyne — if you don’t have an account get one here.

Using Neptyne to explore the population growth of the US States

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Douwe Osinga

No responses yet

More from Douwe Osinga

Visualizing Sea level rise

Something like 15 years ago I put together some geographical data I found online containing European elevation and created a GIF out of it…

Visualizing Poetry using DALL-E

Over the weekend I ran an experiment to use DALL-E 2 to automatically illustrate poems. I had been playing with this thing a few months…

Leaving Triposo

[cross posted from my personal blog]

Use DALL-E to create infinite zoom movies

Ever since DALL-E launched a few months ago, OpenAI’s machine learning tool has been blowing the Internet’s mind with its uncanny ability…

Recommended from Medium

I used OpenAI’s o1 model to develop a trading strategy. It is DESTROYING the market

It literally took one try. I was shocked.

Our Statement on the Passing of President Carter

For decades, you could walk into Maranatha Baptist Church in Plains, Georgia on some Sunday mornings and see hundreds of tourists from…

Lists

Coding & Development

Predictive Modeling w/ Python

Practical Guides to Machine Learning

ChatGPT

Jeff Bezos Says the 1-Hour Rule Makes Him Smarter. New Neuroscience Says He’s Right

Jeff Bezos’s morning routine has long included the one-hour rule. New neuroscience says yours probably should too.

Common side effects of not drinking

By rejecting alcohol, you reject something very human, an extra limb that we have collectively grown to deal with reality and with each…

Trump Fiddles While Musk Burns Down MAGA

Musk unleashes his anger on MAGA Republicans and vows "war."

Ten predictions for data science and AI in 2025

On agents, open source models, safety, and more