Recently, a BI database produced by an ELT pipeline showed duplicate records, so
I needed to diff the tables. A friend suggested using Redis to diff the ids, so
I went with it. We need the redis-cli utility for this.
The fun part is doing it directly from the shell.
Generating two sets:
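As a rough sketch, assuming Redis is running locally on the default port and the ids from each table have already been exported one per line, loading them could look like the following. The function name `load_ids`, the set names, and the file names are hypothetical, and the ids are assumed to contain no whitespace.

```shell
# Load one id per line from stdin into the Redis set named by $1.
# xargs batches up to 1000 ids into each SADD call, so redis-cli is
# started once per batch rather than once per id.
load_ids() {
  xargs -n 1000 redis-cli SADD "$1" > /dev/null
}

# Hypothetical usage, with one exported id file per table:
#   load_ids table_a_ids < table_a_ids.txt
#   load_ids table_b_ids < table_b_ids.txt
```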
I’ve found this to be reasonably fast, even without constructing raw Redis
protocol for the inserts.
Now we can generate a list of duplicate ids:
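A minimal sketch of this step, assuming "duplicate" means an id that appears in both sets, is a set intersection with SINTER. If what you want is the ids present in one table but missing from the other, SDIFF is the counterpart. The function names are hypothetical; when its output is piped, redis-cli prints one member per line.

```shell
# Ids present in both sets (the assumed meaning of "duplicate" here).
duplicate_ids() {
  redis-cli SINTER "$1" "$2"
}

# Ids present in the first set but not in the second.
missing_ids() {
  redis-cli SDIFF "$1" "$2"
}

# Hypothetical usage:
#   duplicate_ids table_a_ids table_b_ids > duplicate_ids.txt
```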
Then we can find the records by using the output of the above command:
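One way to do this, sketched under the assumption that the ids are numeric and the table has an `id` column, is to turn the list of ids into a SQL `IN` clause. The function `ids_to_query` and the table name are hypothetical, and the database client you pipe the query into depends on your setup.

```shell
# Read one numeric id per line from stdin and print a SELECT for the
# table named by $1. paste joins the lines into a comma-separated list.
# Not safe for untrusted or non-numeric ids, since values are not quoted.
ids_to_query() {
  printf 'SELECT * FROM %s WHERE id IN (%s);\n' "$1" "$(paste -sd, -)"
}

# Hypothetical usage, feeding the query to whatever SQL client you use:
#   redis-cli SINTER table_a_ids table_b_ids | ids_to_query my_table
```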
On macOS, ghead is installed via brew install coreutils. It is needed because
BSD head does not support negative arguments.
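To illustrate what a negative argument does: GNU head with `-n -2` prints everything except the last two lines, which is handy for trimming trailing footer lines from a database client's output. The sketch below assumes that use case; the function name `drop_footer` and the two-line footer are hypothetical.

```shell
# Use ghead where it exists (macOS with coreutils), otherwise assume
# the system head is GNU head, as on most Linux distributions.
if command -v ghead > /dev/null 2>&1; then HEAD=ghead; else HEAD=head; fi

# Print stdin without its last two lines.
drop_footer() {
  "$HEAD" -n -2
}

# Hypothetical usage: strip a row-count line and a trailing blank line.
#   printf '1\n2\n(2 rows)\n\n' | drop_footer
```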