Spreadsheets – Depict Data Studio
https://depictdatastudio.com

File Naming Tips
https://depictdatastudio.com/file-naming-tips/
Fri, 13 Mar 2026 14:02:25 +0000

File-naming tips so you don’t lose your work:

✖️ DON’T “save” the same file over and over and over.

All programs crash, especially Excel.

All files get corrupted sometimes, especially Excel.

Saving the same file over and over = more likely to lose your work.

✔️ Instead, “SAVE AS” a new file each work session.

I put the dates in the file name, not “v1” or “v2” or “final-final.”

I use YYYY-MM-DD format so they line up chronologically, not “Jan, Feb, Mar” or “3-2026” accidentally preceding “4-2025.”

I also write brief notes to myself about the focus of that work session (in case I mess something up and need to go back to a previous version).
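Here's a rough sketch of that naming pattern, written in Python purely for illustration. The project name, the session notes, and the choice to put the date at the front of the file name are all assumptions for the example, not a required convention. It shows why YYYY-MM-DD names line up chronologically when sorted alphabetically.

```python
from datetime import date


def session_filename(project: str, note: str, day: date, ext: str = "xlsx") -> str:
    """Build a 'SAVE AS' file name: date first, then project, then a brief note."""
    # date.isoformat() returns YYYY-MM-DD, which sorts chronologically as plain text.
    return f"{day.isoformat()}_{project}_{note}.{ext}"


# Hypothetical work sessions (names and notes are made up).
names = [
    session_filename("budget-dashboard", "added-slicers", date(2026, 3, 13)),
    session_filename("budget-dashboard", "cleaned-dates", date(2025, 4, 2)),
    session_filename("budget-dashboard", "first-draft", date(2025, 1, 15)),
]

# With YYYY-MM-DD names, alphabetical order is also chronological order.
iso_sorted = sorted(names)

# Month-year labels sort as text, so "3-2026" lands before "4-2025".
month_year_sorted = sorted(["4-2025", "3-2026"])
```

Sorting the folder by name then gives the same order as sorting by work session, with the oldest file first.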

A coworker taught me the “SAVE AS” technique almost 20 years ago, and when my files get corrupted and garbled and lost entirely, I’m so grateful – because I’ve only lost a single work session of time, not the entire project.

Splitting with =TEXTBEFORE and =TEXTAFTER in Excel
https://depictdatastudio.com/splitting-with-textbefore-and-textafter-in-excel/
Mon, 22 Sep 2025 15:08:00 +0000

In this video, you’ll see 3 ways to “split” data in Excel:

  • Text to columns
  • Textsplit
  • Textbefore and textafter

Then, you’ll learn how to use =TEXTBEFORE and =TEXTAFTER:

Download the Excel File

And practice yourself: https://depictdatastudio.kit.com/textbeforetextafter

Transcript

[00:00:00] In this video, you’re gonna practice text before and text after, which are methods of splitting.

And I want you to practice this along with me.

So look down below this YouTube video in the description, and you’ll see a link to download this for free.

Okay, let me give you some context about the project and then we will get into the actual text before and text after functions.

Recently I was working on a project that looked like this, where I had, you know, a bunch of data. I had things like ID numbers, I had country codes, and I wanted to find country names, and I was gonna fill them in with good old lookup formulas, which are beyond the scope of this video. But, you know, with like a VLOOKUP, HLOOKUP, XLOOKUP, or INDEX MATCH to fill them in.

I needed to find a list of the codes and the names. I went to like good old Copilot and I just asked it. I was like, “Hey, this is what I need. I need country names, I need country codes.” And within seconds it gave me the list.

I [00:01:00] tried copying it and it gave me like all this messy stuff: Afghanistan, space, space, space, space, space, space, space, pipe, space, AF.

And a lot of times people are like, “Ann, like just type it in by hand,” et cetera. I don’t have time to do that for like 50 bajillion countries and codes. That’s where splitting formulas come to the rescue.

There’s different types of splitting. This is not an exhaustive list. These are just the ones that are the most related.

I tried to put some notes here just as like a quick, you know, a quick cheat sheet for you about the versions and pros and cons.

There’s text to columns. Available in all the versions, it’s buttons. It’s a wizard. It’s great, unless you are using uppercase T Excel Tables. It has to be done manually. I don’t love this. That means Future You, Future Ann has to spend more time doing it over and over and over.

Text split, amazing, but it’s not in all the versions of Excel. It doesn’t [00:02:00] work with Tables. It spills into nearby cells, which can like be a tricky thing to work around.

So enter text before and text after, which we’re focusing on today.

They’ve been around for a while, so a lot of your coworkers and colleagues that you share files with probably have them at this point.

They are formulas, which means Future you can easily replicate this. And they don’t spill, which is amazing.

Okay, let’s get into the how tos. I’ll zoom way in. I’ll demo and then remember, you should download this spreadsheet and try it yourself.

The goal is to have country name and then over here in a separate column, the country code.

Text before is gonna grab this whole cell, that whole text, comma, the delimiter, like what is it that separates it?

Well, it’s a pipe which is on my keyboard. It’s between backspace and enter is what that little symbol is. [00:03:00]

Now A3 is a cell reference. You can see the color codes. It’s like a specific location, so I don’t need any quotes around it. But the pipe is not a cell reference, so I have to surround it with the double quotes because it’s like things I’m selecting off of my keyboard.

Okay, so this is gonna grab everything before the pipe.

Country code comes after. Okay, text after. I wanna split out this cell that’s all smooshed together. And the delimiter is the pipe, which again goes inside double quotes, as you know. Mm-hmm. Okay. So you get Afghanistan, AF, and then you can fill these all the way down and they work for you. Okay.

One thing I am curious about though, remember how it’s like Afghanistan, space, space, space, space, space. I feel like to like really sleep well at night, I would probably wanna trim off the extra spaces off both of these, just to be like extra sure. I’m probably going to onion layer them, nest them inside a trim.

[00:04:00] Then I can know they’re like completely, completely perfect.
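Here's a minimal sketch of the same split-then-trim logic, written in Python for anyone who wants to check it outside Excel. In the spreadsheet itself, the nested versions would look something like =TRIM(TEXTBEFORE(A3,"|")) and =TRIM(TEXTAFTER(A3,"|")). The function names below are hypothetical stand-ins that imitate the Excel behavior under simple assumptions (one delimiter, plain spaces); they are not Excel's own implementation.

```python
def text_before(text: str, delimiter: str) -> str:
    """Return everything before the first delimiter (imitates =TEXTBEFORE)."""
    before, found, _ = text.partition(delimiter)
    if not found:
        # Excel would show #N/A when the delimiter is missing.
        raise ValueError("delimiter not found")
    return before


def text_after(text: str, delimiter: str) -> str:
    """Return everything after the first delimiter (imitates =TEXTAFTER)."""
    _, found, after = text.partition(delimiter)
    if not found:
        raise ValueError("delimiter not found")
    return after


def trim(text: str) -> str:
    """Imitate =TRIM: drop leading/trailing spaces and collapse repeated spaces."""
    return " ".join(part for part in text.split(" ") if part)


cell = "Afghanistan       | AF"  # the smooshed-together cell, e.g. A3

country_name = trim(text_before(cell, "|"))
country_code = trim(text_after(cell, "|"))
```

Without the trim step, the name would keep its trailing spaces and the code would keep its leading space, which is the kind of thing that quietly breaks lookup formulas later.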

Have fun playing around with text before and text after. I love these time savers and I hope you do too.

Make a Unique List in Excel
https://depictdatastudio.com/make-a-unique-list-in-excel/
Mon, 08 Sep 2025 15:08:00 +0000

Do you need a list of all the possible categories, without any duplicates?

Here’s how you make a unique list in Excel with a feature called Remove Duplicates:
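If it helps to see the idea behind a unique list spelled out, here's a small Python sketch. It's only an illustration of the logic (keep the first occurrence of each category, drop the repeats), not a description of how Excel's Remove Duplicates works internally, and the category names are made up.

```python
def unique_list(values):
    """Keep the first occurrence of each value, preserving the original order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(values))


# Hypothetical column of categories with repeats.
categories = ["Health", "Education", "Health", "Housing", "Education"]

unique_categories = unique_list(categories)
```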

Two Types of Datasets: Contiguous vs. Non-Contiguous
https://depictdatastudio.com/contiguous-datasets-a-critical-prerequisite-for-useful-data-visualization/
Tue, 05 Nov 2024 16:08:00 +0000

“Ann, I loved your training, but I’m having trouble applying what I learned. Something’s off with my datasets, and the graphs are taking forever!”

This past year, I’ve spent more time teaching about data management than data visualization.

When I look under the hood of companies’ spreadsheets, I’ve noticed way too many data management issues that could be avoided altogether.

In this article, you’ll learn about a prerequisite for data visualization: contiguous datasets.

Benefits of Contiguous Datasets

In this video, you’ll see how a single contiguous dataset lets you use:

  • ONE set of formulas for data cleaning, recoding, and analyses
  • ONE set of pivot tables
  • ONE set of charts

Then, at the end, you can slice and dice your charts with a variety of filters.

Mini Datasets Spread Across One Sheet – NO!

Here’s what I often see:

Separate datasets for each time period.

NOOOOOOOOOOOOOOOOO!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Sometimes there are dozens of mini datasets, like this:

Mini Datasets Spread Over Multiple Sheets – NO!

Or, just as terrible for graphs and dashboards — one mini dataset per sheet, like this.

NOOOOOO!!!!!!!!!!!!!!!!!!!!!!

Or, separate mini datasets spread across different Excel files altogether.

NOOOOOOOOO!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Ann, What’s So Bad about Mini Datasets?!

Separate mini datasets (“non-contiguous” or “non-touching” datasets) mean that we can only look at one time period at a time.

We have to make a bunch of mini charts.

It takes forever to make these the first time, and they’re a huge pain to update over time.

It’s also tougher for our viewers to find patterns because the numbers are scattered across too many charts.

NOOOOOO!!!!!!!!!!!!!!!

Dataviz Prerequisite: A Single Contiguous Dataset

Instead, the numbers should be stored in a single dataset, with the timeframe as its own column, like this:

This running list of new entries — a log — is going to get very long.

In real-life projects, the logs might have hundreds of thousands of entries.
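To make the restructuring concrete, here's a minimal Python sketch of stacking separate mini datasets into one running log, with the timeframe stored as its own column. The project names, column names, and numbers are all invented for the example.

```python
# Hypothetical mini datasets: one small table per month (made-up numbers).
mini_datasets = {
    "2024-01": [
        {"project": "A", "participants": 12},
        {"project": "B", "participants": 7},
    ],
    "2024-02": [
        {"project": "A", "participants": 15},
        {"project": "B", "participants": 9},
    ],
}


def combine_into_log(datasets):
    """Stack every mini dataset into one list of rows, adding a timeframe column."""
    log = []
    for timeframe, rows in datasets.items():
        for row in rows:
            # Copy the row and record which time period it came from.
            log.append({**row, "timeframe": timeframe})
    return log


log = combine_into_log(mini_datasets)
```

The log gets longer with every month, but every row has the same columns, so one set of formulas, pivot tables, and charts can cover all of it.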

That’s okay!!!!!!!!!!!!!!!!! That’s preferred!!!!!!!!!!!!!!!!!!!!

It’s counterintuitive, but contiguous logs make dataviz faster, not slower.

A single Excel sheet can hold more than a million rows (1,048,576, to be exact).

The length of a dataset won’t make your analysis or visualization take any longer. Repeat after me: Contiguous logs make dataviz faster, not slower.

However…

The width — the number of columns — can certainly take a while, because there are so many different variables to consider.

Bonus: Save Your table as an Excel Table for Easier Updating

A table is the generic term for a collection of rows and columns.

An Excel Table is a special feature that makes it faster and easier to update our log.

In other words, Excel Tables make it easier to append our contiguous logs as we get new data.

How to Turn tables into Excel Tables

You’ll simply click on your contiguous log — your generic table.

Then, go to the Insert tab.

Choose Table.

Click OK.

You’ll recognize the banded rows.

Adding New Entries to Datasets Stored as Excel Tables

Adding new entries — or appending — is easy.

Let’s pretend you’re downloading data from your organization’s database. You might only be able to download one month at a time into its own sheet. That’s okay!

We’ll simply copy and paste those new entries into our running log.

Then, we’ll add the timeframe to that right-most column, too.

Excel is smart, and it’ll know that your new entries are part of the same dataset. In other words, your new entries will feed into pivot tables and formulas seamlessly.

Contiguous Datasets are Required for Static Dashboards

Want a short handout, PDF, or email attachment to share with others?

Maybe you’d want to see how all the projects combined are doing.

Or, maybe you’d want a breakdown of the different projects.

You could even add quick vizzes like sparklines to see trends, like this:

Contiguous datasets are required in order to make static dashboards.

Otherwise the sumifs, countifs, and averageifs behind the scenes will be impossible. Or, the formulas will be painfully slow to set up.
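Here's a hedged sketch, in Python, of what those conditional formulas do against a single contiguous log. The helper names imitate SUMIFS, COUNTIFS, and AVERAGEIFS but are hypothetical, and the log itself is made up. One set of criteria works on the whole dataset because every row lives in the same table.

```python
# Hypothetical contiguous log: one row per project per quarter.
log = [
    {"project": "A", "timeframe": "Q1", "amount": 100},
    {"project": "B", "timeframe": "Q1", "amount": 50},
    {"project": "A", "timeframe": "Q2", "amount": 200},
    {"project": "B", "timeframe": "Q2", "amount": 150},
]


def matching(rows, **criteria):
    """Return the rows where every criterion column equals the requested value."""
    return [r for r in rows if all(r[col] == val for col, val in criteria.items())]


def sumifs(rows, column, **criteria):
    """Add up one column for the rows that meet the criteria."""
    return sum(r[column] for r in matching(rows, **criteria))


def countifs(rows, **criteria):
    """Count the rows that meet the criteria."""
    return len(matching(rows, **criteria))


def averageifs(rows, column, **criteria):
    """Average one column for the matching rows (errors out if nothing matches)."""
    hits = matching(rows, **criteria)
    return sum(r[column] for r in hits) / len(hits)


total_project_a = sumifs(log, "amount", project="A")
entries_in_q2 = countifs(log, timeframe="Q2")
average_project_b = averageifs(log, "amount", project="B")
```

With mini datasets scattered across sheets, each of these would need its own range for every time period, which is where the setup time goes.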

Static dashboards should take less than an hour to design from start to finish.

If it’s taking longer than that, it’s probably because (a) you don’t have a contiguous dataset or (b) you need more practice with formulas.

Contiguous Datasets are Required for Interactive Dashboards

Want to make interactive dashboards in Excel?

Your technical coworkers will love exploring the insights for themselves.

Interactive dashboards involve four pieces:

  1. A single contiguous dataset stored as a regular ol’ table or an Excel Table. You already know I prefer Excel Tables for datasets that are going to be added to or appended in the future.
  2. Pivot tables to tabulate the numbers (and bypass formulas, which can be tricky for novices).
  3. Pivot charts to, you know, visualize the numbers.
  4. Slicers (a fancy name for the filters).

Once again, contiguous datasets are the foundation of data visualization.

Have I sold you on contiguous datasets yet???

Contiguous datasets are required for:

  • Making a single graph to show comparisons over time (not January, February, and March in separate graphs that take three times as long to create and update);
  • Making static dashboards with formulas and trendlines that’ll update (nearly) automatically as you add new entries to your log; and
  • Making interactive dashboards with charts that’ll update (nearly) automatically as you add new entries to your log.

If your data visualization is taking too long… it’s usually a data management problem.

And it can be easily fixed!!!

Start storing all your non-contiguous datasets as a single contiguous dataset.

Two Types of Tabulations: Formulas vs. Pivot Tables
https://depictdatastudio.com/two-types-of-tabulations-formulas-vs-pivot-tables/
Mon, 28 Aug 2023 15:08:00 +0000

You learned about two types of tables: datasets vs. tabulations.

Then, you learned about two types of datasets: contiguous vs. non-contiguous.

Now, let’s learn about two types of tabulations: formulas vs. pivot tables.

Tabulation Option 1: Formulas

Formulas and pivot tables are both correct… in different circumstances.

Here are the pros and cons of each approach so you can figure out which one you’ll need.

Formulas:

  • are necessary for tabulating numbers;
  • are faster for datasets with matching columns;
  • play well with quick vizzes;
  • give us full control over tabulations; and
  • give us full control over charts; but
  • involve a learning curve.

Formulas: Necessary for Tabulating Numbers

In Simple Spreadsheets, we talk about the calculations needed for different types of variables: nominal, ordinal, interval, and ratio.

When it comes to formulas, we can put these variables into two buckets: numbers and categories.

Numbers are test scores, ages, number of people, amount of money donated, etc.

For numbers, we need to tabulate them using descriptive statistics, which often aren’t possible with pivot tables.

Descriptive statistics for numbers might include:

  • Measures of central tendency (=average, =median, =mode)
  • Measures of dispersion (=stdev, =var, =min, =max, and range)
  • Characterizing the distribution (=skew, =kurt)
  • Quartiles (=quartile)
  • Percentiles (=percentile)
  • Outliers (There are multiple ways to define and deal with outliers; in many projects, we flag values more than 3 standard deviations above or below the mean)
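As a rough cross-check of what these formulas compute, here's a short Python sketch using only the standard library. The test scores are made up. Skewness and kurtosis are left out because Python's standard library has no built-in for them, and the quartiles use the "inclusive" interpolation method, which is an assumption about which quartile definition you want.

```python
import statistics

scores = [62, 70, 70, 75, 81, 88, 93]  # made-up test scores

summary = {
    "average": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "stdev": statistics.stdev(scores),    # sample standard deviation
    "var": statistics.variance(scores),   # sample variance
    "min": min(scores),
    "max": max(scores),
    "range": max(scores) - min(scores),
    # Three cut points: first quartile, median, third quartile.
    "quartiles": statistics.quantiles(scores, n=4, method="inclusive"),
}


def outliers(values, threshold=3):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


flagged = outliers(scores)
```

With only seven scores, nothing can sit 3 standard deviations from the mean, so the flagged list comes back empty here; the rule only starts to bite on longer datasets.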

Formulas: Faster for Datasets with “Matching” Columns

Years ago, I demonstrated how to tabulate satisfaction survey data with “matching” columns.

In the fictional-but-inspired-by-real-projects dataset, each survey question was in its own column.

Every survey question had the same options: strongly agree, agree, disagree, and strongly disagree.

In other words, this dataset had matching columns.

In this 5-minute video, you’ll see how we can write one formula, and then drag it down and across to quickly tabulate matching columns.
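The "one formula, dragged down and across" idea can be sketched in Python like this. It's an illustration under assumptions: the survey questions and responses are invented, and the function name is hypothetical. Because every column shares the same response options, a single counting rule covers every question.

```python
from collections import Counter

OPTIONS = ["Strongly agree", "Agree", "Disagree", "Strongly disagree"]

# Hypothetical survey: each question is its own column of responses.
survey = {
    "Q1": ["Agree", "Strongly agree", "Agree", "Disagree"],
    "Q2": ["Strongly agree", "Strongly agree", "Agree", "Strongly disagree"],
}


def tabulate_matching_columns(columns, options):
    """Count every response option for every question with the same rule."""
    table = {}
    for question, responses in columns.items():
        counts = Counter(responses)
        # Options nobody picked still show up, with a count of 0.
        table[question] = {option: counts[option] for option in options}
    return table


tabulation = tabulate_matching_columns(survey, OPTIONS)
```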

Formulas: Play Well with Quick Vizzes

Formulas feed seamlessly into at-a-glance visualizations, like sparklines, data bars, heat tables, and symbol fonts.

(Pivot tables don’t.)

Formulas: Give Us Full Control over Tabulations

Need to compare your numbers to a target?

Need to see how much the numbers have changed over time (e.g., percent change or percentage-point change from month to month)?

These tabulations can be tedious or impossible with pivot tables.
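Here's a small Python sketch of both comparisons, with made-up monthly totals and a made-up target. In a spreadsheet, the percent change would typically be a formula along the lines of =(new-old)/old in its own column.

```python
def percent_change(old, new):
    """Percent change from the old value to the new value."""
    return (new - old) * 100 / old


monthly_totals = {"Jan": 200, "Feb": 250, "Mar": 225}  # made-up numbers
target = 240  # hypothetical monthly target

months = list(monthly_totals)

# Month-over-month percent change, starting with the second month.
changes = {
    months[i]: percent_change(monthly_totals[months[i - 1]], monthly_totals[months[i]])
    for i in range(1, len(months))
}

# Distance from the target (negative means below target).
gap_to_target = {month: total - target for month, total in monthly_totals.items()}
```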

Formulas: Give Us Full Control over Charts

We can make a billion different charts in Excel. Here’s an incomplete listing of the Excel vizardry that’s possible with good ol’ Excel.

Want to make a native chart? One of the common built-in charts, like bars, columns, pies, and lines? Pivot tables will feed into native charts just fine.

Want to make a non-native chart? Population pyramids, dots, lollipops, swarms, b’arcs, tile grid maps, diverging stacked bars, etc.? Advanced vizardry is only possible with magic tables, which have formulas underneath, not pivot tables.

For example, if you want to make a swarm plot (a.k.a. jittered dot plot), like this:

Swarm plots are non-native charts, so we’ll need formulas behind the scenes to have full control over the chart’s creation and formatting, like this:

Formulas: Expect a Learning Curve

Sure, most people know the absolute basics, like sum and average.

But there are 450+ formulas and functions inside Excel.

Knowing which ones you need… at which point in the analytical process to use them… and how to use them… That takes training and practice.

Tabulation Option 2: Pivot Tables

Pivot tables are a drag-and-drop solution for tabulating our datasets.

In other words, we don’t have to write any formulas! No need to stress over jargon like “” or () or , or A1:A100.

Pivot tables are:

  • great for novices;
  • great for tabulating categories;
  • faster for cross-tabulations;
  • slightly faster for appended tables and recurring analyses;
  • way faster for mismatched columns; and
  • necessary for interactive dashboards.

Pivot Tables: Great for Novices

Let’s start with the biggest benefit of choosing pivot tables over formulas: there’s a minimal learning curve, so pivot tables are perfect for novices.

Here’s an older blog post that shows you how to get started with pivot tables within minutes. You’ll insert a brand new pivot table, and then drag and drop variables into the little boxes.

Sure, there are nuances:

  • switching the units from sums and counts;
  • double-clicking to explore mysterious entries and outliers;
  • placing two variables in the values box (e.g., counts and their percentages); and
  • refreshing the pivot table as new entries are added to the dataset.

But, anyone and everyone can learn the basics within minutes — supervisors who don’t have time to delve into the details of formulas, graphic designers who don’t need to conquer statistics, grantmakers who need to focus on the actual philanthropy and not statistical formulas, etc.

Pivot Tables: Great for Tabulating Categories

Formulas are great for numbers, because we’ll need to calculate descriptive statistics like mean, median, mode, standard deviation, variance, quartiles, percentiles, skewness, and kurtosis, among many others.

Pivot tables are great for categories, because we’ll need to calculate frequencies (like how many people).

Yes, we can also calculate frequencies with formulas (countifs, for example).

Pivot Tables: Faster for Cross-Tabulations

A regular ol’ tabulation might be the number of male and female employees.

A cross-tabulation adds another variable or two, like the number of male and female employees in each state.

Yes, we can do cross-tabulations with formulas, too (another perfect opportunity for countifs). But especially for novices, the drag-and-drop functionality is going to be faster than adding to an existing formula.
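To show the difference between the two, here's a minimal Python sketch with an invented employee list. The tabulation counts one variable; the cross-tabulation counts combinations of two.

```python
from collections import Counter

# Hypothetical dataset: one row per employee.
employees = [
    {"gender": "Female", "state": "Ohio"},
    {"gender": "Male", "state": "Ohio"},
    {"gender": "Female", "state": "Texas"},
    {"gender": "Female", "state": "Ohio"},
]

# Regular tabulation: how many employees of each gender.
tabulation = Counter(e["gender"] for e in employees)

# Cross-tabulation: how many employees of each gender in each state.
cross_tab = Counter((e["state"], e["gender"]) for e in employees)
```

Adding the second variable is one extra field in the counting key here, which is roughly what dragging a second field into a pivot table does.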

Pivot Tables: Slightly Faster for Appended Datasets with Recurring Analyses

Need to add to your dataset over time?

Maybe you collect daily outbreak data, like many public health agencies I work with.

Or, maybe you collect quarterly data from grantees, like many foundations I work with.

(Or some other time period — like weekly, or annually, or whatever.)

As you add to your dataset — your contiguous log — you can simply refresh your pivot table and it’ll incorporate the latest numbers. That means that the chart(s) linked to your pivot table will update with the latest numbers, too! Woohoo!

Yes, it’s easy to update formulas as we append datasets, too.

You simply create one anchor formula — the formula in the upper-left of your tabulation — and drag it across and/or downwards to fill all the cells, like this:

Pivot Tables: Way Faster for Mismatched Columns

Earlier, I said I prefer formulas for matching columns (e.g., all the columns contain agree-disagree response options).

I prefer pivot tables for mismatched columns (e.g., one column has agree-disagree options, another column has birthdates, another column has addresses, and so on).

It would be a huge pain to add so many different formulas along the bottom of my dataset! I might need countifs for one column, and sumifs for another column, and averageifs for another column… meh.

Pivot Tables: Necessary for Interactive Dashboards

To build interactive dashboards in Excel, you’ll need to create pivot tables, then pivot charts, then slicers.

To the best of my knowledge, interactive dashboards have to be built off pivot tables, not formulas.

Here’s an example of an interactive dashboard that’s linked to pivot tables:

The Bottom Line

There are two ways to tabulate your dataset: through formulas, or through pivot tables.

Formulas:

  • are necessary for tabulating numbers;
  • are faster for datasets with matching columns;
  • play well with quick vizzes;
  • give us full control over tabulations; and
  • give us full control over charts; but
  • involve a learning curve.

Pivot tables are:

  • great for novices;
  • great for tabulating categories;
  • faster for cross-tabulations;
  • slightly faster for appended tables and recurring analyses;
  • way faster for mismatched columns; and
  • necessary for interactive dashboards.

Neither option is terrible. Neither option is perfect.

As usual, there are pros and cons.

Your Turn

When do you tabulate your datasets with formulas vs. pivot tables?

This isn’t an exhaustive list of pros and cons. What am I missing??

Two Types of Tables: Datasets vs. Tabulations
https://depictdatastudio.com/two-types-of-tables-datasets-vs-tabulations/
Mon, 21 Aug 2023 15:08:00 +0000

Last week’s blog post about contiguous vs. non-contiguous datasets was immensely unpopular.

I had the most unsubscribes to my blog and newsletter of all time — in more than a decade of blogging, YouTubing, and newsletter-ing.

One person said something like this:

“I think the issue is you’re a visualization expert and visually the mini sets are easier. From a data prep perspective, one really long table is the correct way to store the underlying data. Dealing with dozens of tables that should just be a single set is a typical rookie mistake.”

Let’s chat more about that distinction: storing underlying data vs. tables that look nice visually.

Two Types of Tables

The term “table” is tricky.

At its core, a table is just a collection of rows and columns.

But you’ll need different types of tables at different phases in the data analysis and visualization process.

Here’s the major distinction you need to understand:

  1. Datasets are tables where your data is stored.
  2. Tabulations are tables where those datasets are summarized.

Let’s look at each type in more detail.

Type 1: Datasets

The first type of table is a dataset, which is where your data is stored.

Sort-of synonyms:

  • Raw data: This is a sort-of synonym. The term raw means the data hasn’t changed since you received it (e.g., a coworker emailed it to you); since you downloaded or exported it (e.g., from a public-facing website, or from your agency’s database); or since you or someone else manually entered it.
  • Clean data: This is a sort-of synonym. The term clean means the data has changed since you received it. You checked for duplicates and missing data; you checked for and dealt with outliers; and/or you cleaned and recoded variables (e.g., by transforming a MM-DD-YYYY date into Q1, Q2, Q3, or Q4, among hundreds of other recodings that are often necessary).
  • Master dataset: This is a direct synonym — and this is the term I learned in undergraduate and graduate statistics courses — but we don’t use slavery terms anymore. I’ve been hunting for a better term for a couple years. If you’re in Simple Spreadsheets then you’ve heard me talk about this a lot. I’ve experimented with the terms central headquarters or hub to replace master dataset, but none of them felt right. The term that currently feels most accurate is contiguous dataset.
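The date-to-quarter recoding mentioned under clean data can be sketched in a few lines of Python. This assumes calendar quarters (January through March is Q1) and dates typed as MM-DD-YYYY text; a fiscal year that starts in another month would need a different rule. The function name is hypothetical.

```python
from datetime import datetime


def quarter_from_date(text: str) -> str:
    """Recode a MM-DD-YYYY date string into a calendar quarter label."""
    parsed = datetime.strptime(text, "%m-%d-%Y")
    # Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3, and 10-12 to Q4.
    return f"Q{(parsed.month - 1) // 3 + 1}"


quarters = [quarter_from_date(d) for d in ["03-13-2026", "04-01-2025", "12-31-2024"]]
```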

Datasets: Contiguous vs. Non-Contiguous

Datasets should be contiguous, i.e., touching or sharing a border.

If you want to be efficient, that is.

Non-contiguous datasets — dozens of mini datasets located across different sheets or Excel files — lead to wasted time, wasted money, and wasted brainpower.

Datasets: Stored as Excel Tables for Easy Appending

Datasets should be stored as Excel Tables when you need to append them later, i.e., if you’ll be adding to them.

You can learn more about contiguous vs. non-contiguous datasets and tables vs. Excel Tables in this blog post. The Simple Spreadsheets course is all about data management and analysis, too.

Type 2: Tabulations

The second type of table is a tabulation. Tabulations are tables where the datasets are summarized.

For example, the dataset might have one entry per project. The tabulation might show the totals and/or averages across all the projects.

Datasets and tabulations have different purposes. They’re used at different points in the analytical process. They look different. They are different.

Synonyms:

  • Summary table
  • Summary statistics
  • Report
  • Key metrics

How to Tabulate the Dataset

You’ve got two options in Excel:

  1. Formulas (sumifs, countifs, averageifs, lookups, etc.) will play nicely with the quick vizzes (below). They require more skill and practice, though.
  2. Pivot tables will play nicely with the interactive dashboard (below). Anyone can learn pivot tables within minutes, so I often recommend them for the beginner/intermediate crowd.

This distinction deserves its own blog post, too. In all my “spare” time, ha! We also talk about the distinctions between formulas and pivot tables in detail inside Simple Spreadsheets.

Tabulations: Can Be the End Product (meh)

The tabulation might be the end product that you share with others.

I suppose you could email the summary table to colleagues. You could post it on a website, or share it on a slide.

Except… meh.

Why not bring those visuals to life?!

Tabulations: Can Feed into Mini-Graphs

Why not add quick vizzes to bring tabulations to life?!

Sparklines, data bars, heat tables, and symbol fonts are my go-to’s.

Visuals make it easier for our brains to spot patterns. It’s obviously faster to look at a viz than to read all the numbers.

Your quick vizzes might look like this:

If you format the sheet for easy printing and PDF’ing, then voila!, you’ve got a static dashboard.

Static dashboards like these are great for internal audiences that (1) need a quick turnaround time and (2) want lots of details from the actual tabulations.

Tabulations: Can Feed into Big Graphs and Dashboards

Tabulations can also feed into larger graphs (for documents and slides).

Or, tabulations can feed into larger graphs for interactive dashboards.

Your interactive dashboard in Excel might look something like this:

The Bottom Line

“Table” is a tricky term. It’s broad and generic. It means different things to different people.

There are two main types of tables:

  1. Datasets are the underlying data source. You might have one entry (one row) per person, or per organization, or per project. Datasets should be contiguous because scattered mini datasets waste time, money, and brainpower.
  2. Tabulations are the summary tables. You might tally-up how many people, or how many organizations, or how many projects. Tabulations might be your end product (yawn!). Or, they might feed into graphs and dashboards (yay!).

We need both datasets and tabulations. But these are different types of tables.
