When JavaScript's Array.filter Isn't Enough
February 3, 2023
Want to see something weird? Instead of filtering an array with Array.filter
, as in:
// find all items with exactly four characters:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
const filtered = arr.filter((item) => item.length === 4)
// [ '2023', '2024' ]
We can instead use Array.flatMap
to do the same thing:
// find all items with exactly four characters:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
const mapped = arr.flatMap((item) => item.length === 4 ? item : [])
// [ '2023', '2024' ]
Now, you may be saying to yourself, "that's an incredibly dumb way to filter an array." To that I would say "you're absolutely right, it is an incredibly dumb way to filter an array."
So, would we ever want to filter an array this way? The simple truth is that no, we probably don't ever want to filter arrays with flatMap
. It's silly; we should just use filter
.
That said, Array.flatMap
doesn't just filter for us. It also maps the array for us. This let's us hijack it to filter and transform each element of an array:
// find and transform all items with exactly four characters:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
const mapped = arr.flatMap((item) => item.length === 4 ? {year: item} : [])
// [ { year: '2023' }, { year: '2024' } ]
To really see the power of this, though, let's go through a real-world example.
In this post:
A Practical Example
To show how we might be able to leverage this power of Array.flatMap
let's start with an interview-like prompt:
Given an array of strings:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
We need to extract a new array containing only the strings representing years. This new array should be in the format:
const newArr = [{ params: { year: '2023'}}, { params: { year: '2024'}}]
What is the best way to do this?
This may seem like an arbitrary problem you'd find in a technical interview and, well, I think I might start using it for that. But it does have real world applications, as I have discovered.
Some Background
This problem came up when I was building out the current iteration of my blog. I am using Next.js for the site and write all my blog posts in MDX. Because Next has built-in support for MDX as pages, I can just shuck my posts in the pages
directory and they'll be automatically rendered at the corresponding URL, thanks to Next's file-based routing. The structure of the directory containing my posts looks like this:
blog/
├── 2023/
│ ├── post1.mdx
│ └── post2.mdx
├── 2024/
│ ├── post3.mdx
│ └── post4.mdx
├── [year].tsx
└── index.tsx
This structure allows for automatic routing and nice-looking URLs (eg. jessereitz.com/blog/2023/post1
). Organizing posts by year also just feels good to my brain.
The [year].tsx
syntax might look a bit weird if you haven't used Next before. Basically, it's a dynamic route which let's you build out individual pages programatically. So instead of needing to create a page for each year's archive page (blog/2023/index.tsx
, blog/2024/index.tsx
, etc) I can just have Next generate one when it builds. To do so, I need to tell it which paths to build.
Getting the Paths
Next.js has a special function which can be exported from dynamic routes (eg. the [year].tsx
file) called getStaticPaths
. If provided, this function will get called during build time and will instruct Next.js to build a page for each path returned. In our case, the paths need to be returned in an array in the shape of:
const paths = [{ params: { year: '2023'}}, { params: { year: '2024'}}]
Each member of the paths
array represents an individual route. The year
attribute of the params
object matches up with the [year].tsx
. So the above array would result in pages at blog/2023
and blog/2024
but nothing else (eg. blog/2025
would 404).
Back to the Original Problem
In order to tell Next which paths to use for the [year]
dynamic route, I need to fetch a list of all of the "year" subdirectories within the blog
directory. Seems simple enough: just iterate through each member of the blog
directory and return it, right? Well...
Using Node's fs.readdir
and fs.readdirSync
functions we get an array of strings representing each member of the directory. There is no built-in way to fetch only those members which are directories.
import fs from 'fs'
fs.readdirSync('blog')
// ['2023', '2024', '[year].tsx', 'index.tsx']
We end up with an array of all the directories and files within the blog
directory. And so we're back to the original prompt. We have an array (['2023', '2024', '[year].tsx', 'index.tsx']
) and we need to filter it while transforming it to the shape of [{ params: { year: '2023'}}, { params: { year: '2024'}}]
.
The Most Straightforward Solution
The simplest solution to this problem is to just filter the array and them map it to the new shape:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
// Assume `isYear` is defined and returns `true` if the input is a 4-digit year
const filtered = arr.filter((obj) => isYear(obj))
const transformed = filtered.map((obj) => { params: { year: obj }})
// [{ params: { year: '2023'}}, { params: { year: '2024'}}]
This gets the job done and is readable, but I found a more neat way to do it.
The Neat Way
I realized I can hijack JavaScript's Array.flatMap
method to filter and map at the same time.
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
const filteredAndTransformed = arr.flatMap((obj) => isYear(obj) ? {params: { year: obj}} : [])
// [{ params: { year: '2023'}}, { params: { year: '2024'}}]
This has the benefit of being more concise and (potentially) more efficient. I can't say for certain, but it does stand to reason that the underlying mechanics of flatMap
would be such that it would only iterate through the original array once.
A very naive implementation of a flatMap
-like function could be something like this:
function myFlatMap(arr, callback) {
let newArr = []
for (let item of arr) {
const res = callback(item)
if (Array.isArray(res)) {
newArr = [...newArr, ...res]
} else {
newArr.push(res)
}
}
return newArr
}
In contrast, filtering then mapping requires two trips through the array. At worst, the filter
didn't actually remove any items and we need to iterate through the whole array twice.
How Array.flatMap
Works
Array.flatMap
intended purpose is to combine Array.map
and Array.flat
. The former maps each member of an array to a new array while the latter creates a new array by concatenating all elements of any subarrays.
Let's say we want an array of all the characters in an array of strings. So given an array:
const arr = ['2023', '2024']
We should end up with
const newArr = ['2', '0', '2', '3', '2', '0', '2', '4']
Array.map + Array.flat
We can start by converting each string into an array of its characters. This will give us a multi-dimensional array (an array of arrays). Then we can simply call the flat
method to make it a single-dimensional array:
const arr = ['2023', '2024']
const mappedArr = arr.map((obj) => Array.from(obj))
// [ [ '2', '0', '2', '3' ], [ '2', '0', '2', '4' ] ]
const finallArr = mappedArr.flat()
// [ '2', '0', '2', '3', '2', '0', '2', '4' ]
Array.flatMap
Alternatively, we can do this all in a single go with Array.flatMap
:
const arr = ['2023', '2024']
const finalArr = arr.flatMap((obj) => Array.from(obj))
// [ '2', '0', '2', '3', '2', '0', '2', '4' ]
Hijacking flatMap to Filter While Transforming
The cool thing about flatMap
is that if an empty array gets returned at any point while it's map
-ping the array, the empty array will just get thrown away during the flat
-tening of the final array.
So, in our case above:
const arr = ['2023', '2024', '[year].tsx', 'index.tsx']
const filteredAndTransformed = arr.flatMap((obj) => isYear(obj) ? {params: { year: obj}} : [])
You can see that for each item of the initial array arr
:
- we check if the item is a year
- if it is, we return it in its new shape
- if it is not, we return an empty array
When the array is then flattened, any items for which an empty array was returned will be filtered out.