If you think pixelizing a password or other text data will protect it from prying eyes, think again — your information may not be as safe as you’d imagine.
Pixelization (also known as mosaic) is a common technique for concealing information by dramatically reducing the resolution of sensitive areas in an image. For years, it has had broad applications in security and censorship, but its days may be numbered. “Depix” is a new tool that can undo the pixelization of text in screenshots and recover the information underneath. Uploaded this week, the project has already received nearly 10,000 stars on GitHub.
“I’ve seen companies pixelize passwords in internal documents. No tools were available for recovering a password from such an image, so I created one,” writes the Depix developer, Netherlands-based information security consultant Sipke Mellema. The tool works on images pixelized with a linear box filter, which overwrites a box of pixels with the average value of all pixels in the box. Says Mellema, “my algorithm attacks the common linear box filter.”
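To make the filter concrete, here is a minimal sketch of a linear box filter in plain Python. It is an illustration, not code from Depix: the function name `pixelize`, the grayscale list-of-rows image format and the use of integer averaging are all assumptions made for the example.

```python
def pixelize(image, block_size):
    """Apply a linear box filter to a grayscale image (a list of rows of ints).

    Every block_size x block_size box is overwritten with the integer
    average of the pixels it contains. Boxes on the right and bottom
    edges may be smaller if the image size is not a multiple of block_size.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]  # copy so the input is left untouched
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            rows = range(top, min(top + block_size, height))
            cols = range(left, min(left + block_size, width))
            values = [image[y][x] for y in rows for x in cols]
            average = sum(values) // len(values)
            for y in rows:
                for x in cols:
                    result[y][x] = average
    return result


# Hypothetical 4x4 grayscale image, pixelized with 2x2 boxes.
image = [
    [0, 0, 255, 255],
    [0, 255, 255, 255],
    [10, 20, 200, 100],
    [30, 40, 100, 0],
]
pixelized = pixelize(image, 2)
# pixelized == [[63, 63, 255, 255], [63, 63, 255, 255],
#               [25, 25, 100, 100], [25, 25, 100, 100]]
```

Each box is computed only from its own pixels, with no randomness and no influence from neighbouring boxes, so the same input always yields the same output.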
Mellema explains that because linear box filters are deterministic algorithms, pixelizing the same values will result in the same pixelated block, and pixelizing the same text using the same block locations will result in the same block values. Every block, or combination of blocks, can be considered a sub-problem.
To prepare a pixelized text sequence for Depix, first cut the relevant pixelized blocks out of the screenshot as a single rectangle. Then paste a De Bruijn sequence of the expected characters into an editor with the same font settings (text size, font, colour, HSL). Next, take a screenshot of the sequence, preferably with the same screenshot tool previously used to capture the pixelized image.
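A De Bruijn sequence of order n over an alphabet contains every possible n-character combination exactly once when read cyclically. That is presumably why it suits a search image: a pixelized block can straddle two neighbouring characters, so every adjacent pair should appear somewhere. The sketch below uses the standard recursive construction; the function names and the two-character window are assumptions for illustration and are not taken from the Depix code.

```python
def de_bruijn(alphabet, n):
    """Return a cyclic De Bruijn sequence of order n over the given alphabet."""
    k = len(alphabet)
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


def search_text(alphabet, n):
    """Unroll the cyclic sequence so every n-gram also appears in linear text."""
    cyclic = de_bruijn(alphabet, n)
    return cyclic + cyclic[:n - 1]


# Hypothetical character set: every ordered pair of "a", "b", "c" occurs once.
text = search_text("abc", 2)
```

For a realistic password character set the alphabet would include letters, digits and symbols, and the resulting text is what gets pasted into the editor and captured in a screenshot.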
Finally, run Depix on the two images:

    python depix.py -p [pixelated rectangle image] -s [search sequence image] -o output.png
Since the original information under the pixels is lost, it is impossible to directly reverse the filter. Instead, the tool pixelizes the blocks of the search image and looks for blocks whose values match those in the pixelized image. For most pixelized images, it finds single-match results for some blocks, which it assumes are correct. Matches of surrounding multi-match blocks that lie at the same geometrical distance as in the pixelized image are also treated as correct. Once the correct blocks have no more geometrical matches, they are written to the output; for the remaining multi-match blocks, the average of all matches is output.
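The matching idea can be sketched in a few lines of Python. This is a deliberately simplified, one-dimensional illustration under stated assumptions, not Depix's implementation: images are single rows of grayscale values, the helper names are hypothetical, and the geometrical-distance check on neighbouring blocks is left out entirely.

```python
def block_average(pixels):
    """Pixelize one block: the integer average of its pixels."""
    return sum(pixels) // len(pixels)


def find_matches(search_row, block_value, block_size):
    """Return every offset in the search row whose window pixelizes to block_value."""
    return [
        offset
        for offset in range(len(search_row) - block_size + 1)
        if block_average(search_row[offset:offset + block_size]) == block_value
    ]


def recover(pixelized_blocks, search_row, block_size):
    """Rebuild pixels block by block from the search row."""
    recovered = []
    for value in pixelized_blocks:
        offsets = find_matches(search_row, value, block_size)
        if not offsets:
            # No candidate at all: keep the pixelized block unchanged.
            recovered.extend([value] * block_size)
        elif len(offsets) == 1:
            # Single match: assume it is the original content.
            start = offsets[0]
            recovered.extend(search_row[start:start + block_size])
        else:
            # Several matches: output their pixel-wise average.
            windows = [search_row[o:o + block_size] for o in offsets]
            recovered.extend(sum(column) // len(column) for column in zip(*windows))
    return recovered


# Hypothetical data: a search row and two pixelized blocks of width 2.
search_row = [0, 100, 200, 50, 50, 250]
result = recover([125, 50], search_row, 2)
# The block 125 matches only offset 2, so [200, 50] is restored exactly.
# The block 50 matches offsets 0 and 3, so the average [25, 75] is output.
```

The single-match block is restored exactly, while the ambiguous block comes out as a blur of its candidates, which is why output from this kind of search tends to be partly sharp and partly smeared.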
The developer says the Depix technique “beautifully links to vulnerable patterns in cryptography. It’s similar to hash cracking, exploiting the use of ECB, and the utilization of known-plaintext attacks.” He advises people to avoid obfuscation techniques on sensitive data, warning the “assumption that a schema can’t be broken, just because the implementer doesn’t know how, is a common pitfall in information security.”
The Depix project is on GitHub.