Suppose someone gives you a Docker image as a tarball (i.e., the output of `docker save`), and you want to recover a file that was deleted at some point and no longer exists in the final image. How do you do it?
- A Docker image is composed of layers, i.e., snapshots of the filesystem changes made by each Dockerfile command.
- Each layer in an image tarball is itself a tarball.
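To make the layer structure concrete, here is a minimal Python sketch (not the code this post links to) that lists the layer tarballs inside an image tarball. It assumes the tarball contains a top-level `manifest.json` whose first entry has a `Layers` list, as `docker save` output normally does; the function name `layer_paths` is made up for illustration.

```python
import json
import tarfile


def layer_paths(image_tar_path):
    """Return the paths of the layer tarballs inside an image tarball.

    Assumption: the tarball has a top-level manifest.json, and we only
    look at the first image listed in it.
    """
    with tarfile.open(image_tar_path) as image:
        # manifest.json is a JSON list with one entry per saved image.
        manifest = json.load(image.extractfile("manifest.json"))
    # "Layers" lists the layer archives in order, base layer first.
    return manifest[0]["Layers"]
```

For example, `layer_paths("image.tar")` (with `image.tar` being a hypothetical file name) returns a list of paths, each of which is a tarball nested inside the image tarball.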
A TLDR example solution (code can be found here):
The code writes a list of all files in each layer to a user-specified text file. We can then search that text file for the desired file and, finally, extract that single file from the layer tarball that contains it.
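The steps described above can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not the original code: it assumes the image tarball has a top-level `manifest.json` with a `Layers` list, it only looks at the first image in the tarball, and the names `write_layer_listing` and `extract_from_layer` are hypothetical.

```python
import json
import shutil
import tarfile


def _open_layer(image, layer_path):
    # Each layer is itself a tar archive stored inside the image tarball.
    # Mode "r:*" lets tarfile detect whether the layer is compressed.
    return tarfile.open(fileobj=image.extractfile(layer_path), mode="r:*")


def write_layer_listing(image_tar_path, listing_path):
    """Write one 'layer<TAB>file' line for every entry in every layer."""
    with tarfile.open(image_tar_path) as image, open(listing_path, "w") as out:
        # Assumption: manifest.json exists and its first entry lists the layers.
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            with _open_layer(image, layer_path) as layer:
                for name in layer.getnames():
                    out.write(f"{layer_path}\t{name}\n")


def extract_from_layer(image_tar_path, layer_path, member_name, dest_path):
    """Copy a single regular file out of one layer to dest_path."""
    with tarfile.open(image_tar_path) as image:
        with _open_layer(image, layer_path) as layer:
            src = layer.extractfile(member_name)
            if src is None:
                # extractfile returns None for directories and other
                # entries that have no file content.
                raise ValueError(f"{member_name} is not a regular file")
            with open(dest_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
```

A typical session would call `write_layer_listing("image.tar", "files.txt")`, search `files.txt` for the file name, and then pass the layer path and file path from the matching line to `extract_from_layer`. The file names here are placeholders. Copying the bytes with `extractfile` rather than unpacking the whole layer keeps the extraction limited to the one file of interest.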
Note that each layer is committed at the end of its Dockerfile command. So if a file was downloaded and removed in the same command, there’s no way to extract that file from the image history. For example, if a single Dockerfile command downloads abc.tar.gz and then removes it before the command finishes,
abc.tar.gz will not be found in history.