First Impressions

The challenge comes with a ZIP file called dump.zip. Unzipping it results in a file called hello.txt containing the following text:

There's more to it than meets the eye...

From the challenge description and the text from hello.txt, we can guess that there are some hidden files. I checked the ZIP file with binwalk, and we can find A LOT of hidden files.

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             Zip archive data, v0.0 compressed size: 41, uncompressed size: 41, name: hello.txt
135           0x87            Zip archive data, v0.0 compressed size: 33, uncompressed size: 33, name: hi.txt
256           0x100           Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag00
345           0x159           Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag00
434           0x1B2           Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag00
523           0x20B           Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag00
612           0x264           Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag00
...
60598         0xECB6          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
60687         0xED0F          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
60776         0xED68          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
60865         0xEDC1          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
60954         0xEE1A          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
61043         0xEE73          Zip archive data, v0.0 compressed size: 1, uncompressed size: 1, name: flag18
61572         0xF084          End of Zip archive, footer length: 22

The names range from flag00 to flag18, and each file name appears 36 times. Interesting, maybe we can find something if we check the strings in this ZIP file?

V~uK)
hello.txtThere's more to it than meets the eye...
V~uK)
hello.txtPK
hi.txtFind a needle in the haystack...
hi.txtPK
flag00aPK
flag00PK
flag00bPK
flag00PK
flag00cPK
flag00PK
...
flag18{PK
flag18PK
flag18CPK
flag18PK
flag18TPK
flag18PK
flag18FPK
flag18PK
flag180PK
flag18PK
flag181PK
flag18PK
flag183PK
flag18PK
flag187PK
flag18PK
flag18}PK
flag18PK
flag18_PK
flag18PK

Each file name is followed by a character, which is one of the following: the lowercase letters a-z, {, C, T, F, 0, 1, 3, 7, }, _. That is 36 possible characters, matching the 36 copies of each file name. From the range of characters, it looks like extracting the right file for each name would spell out the flag. We need to find out which files to extract.
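The pairing of file names and trailing characters seen in the strings output can also be recovered from the local file headers themselves. Below is a minimal sketch, assuming the 30-byte local file header layout from section 4.3.7 of APPNOTE.TXT, entries that are stored uncompressed (binwalk reports equal compressed and uncompressed sizes), and sizes written in the local header rather than in a data descriptor. The names scan_local_entries and candidates_by_name are made up for illustration.

```python
import struct
from collections import defaultdict

LOCAL_SIG = b"PK\x03\x04"                       # local file header signature
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")   # 30 fixed bytes, little-endian


def scan_local_entries(data):
    """Yield (name, payload) for every local file header signature in data."""
    pos = data.find(LOCAL_SIG)
    while pos != -1:
        if pos + LOCAL_HEADER.size > len(data):
            break  # truncated header at the end of the buffer
        fields = LOCAL_HEADER.unpack_from(data, pos)
        comp_size, name_len, extra_len = fields[7], fields[9], fields[10]
        name_start = pos + LOCAL_HEADER.size
        data_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("ascii", "replace")
        yield name, data[data_start:data_start + comp_size]
        pos = data.find(LOCAL_SIG, pos + 1)


def candidates_by_name(data):
    """Group the payloads (decoded as text) under their file names."""
    groups = defaultdict(list)
    for name, payload in scan_local_entries(data):
        groups[name].append(payload.decode("latin-1"))
    return dict(groups)
```

Calling candidates_by_name on the bytes of dump.zip should list the candidate characters for each of flag00 to flag18, which is the same information as the strings output but keyed by name.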

Searching for the Solution

I came across this Stack Overflow post in my searches. What caught my eye was the name APPNOTE.TXT in the question, which is the name of this challenge.

APPNOTE.TXT happens to be the document containing the file format specification for the ZIP file. With that I figured that I would have to look into the hex bytes for something.

The second thing that caught my eye was the term central directory. I knew a bit about the ZIP file structure, but needed to look it up in more detail.

Central directory: a short introduction

The central directory consists of:

central directory file headers, which point to files in the archive and contain additional metadata.¹
end of central directory record or EOCDR, which is used by zip tools to correctly read and extract files from the archive. This structure contains information about the archive and points to the start of the central directory, which further points to the data in the archive.² ³
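To make the first of these structures more concrete, here is a minimal sketch that reads a single central directory file header with Python's struct module. The field layout follows section 4.3.12 of APPNOTE.TXT (46 fixed bytes followed by the file name); the function name parse_central_header and the dictionary keys are assumptions chosen for illustration, not part of any library.

```python
import struct

CDH_SIG = b"PK\x01\x02"                # central directory file header signature
CDH = struct.Struct("<4s6H3I5H2I")     # 46 fixed bytes, little-endian


def parse_central_header(data, offset):
    """Parse the central directory file header that starts at offset."""
    fields = CDH.unpack_from(data, offset)
    if fields[0] != CDH_SIG:
        raise ValueError("no central directory header at offset %d" % offset)
    name_len = fields[10]
    name_start = offset + CDH.size
    return {
        "name": data[name_start:name_start + name_len].decode("ascii", "replace"),
        "compressed_size": fields[8],
        # where the matching local file header (and the data) lives
        "local_header_offset": fields[16],
    }
```

The local_header_offset field is what lets a ZIP tool jump from the central directory to the actual file data.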

File structure from APPNOTE.TXT:

4.3.16  End of central directory record:

      end of central dir signature    4 bytes  (0x06054b50)
      number of this disk             2 bytes
      number of the disk with the
      start of the central directory  2 bytes
      total number of entries in the
      central directory on this disk  2 bytes
      total number of entries in
      the central directory           2 bytes
      size of the central directory   4 bytes
      offset of start of central
      directory with respect to
      the starting disk number        4 bytes
      .ZIP file comment length        2 bytes
      .ZIP file comment       (variable size)

The document also mentioned that a ZIP file must have only one EOCDR. So, maybe checking the EOCDR segment of the ZIP file will help find out which files to extract?
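The record layout quoted from APPNOTE.TXT maps onto a fixed 22-byte structure (before the optional comment), so it can be unpacked in one call. This is a minimal sketch; the Eocd tuple and the parse_eocd function are names invented here for illustration.

```python
import struct
from collections import namedtuple

EOCD_SIG = b"PK\x05\x06"               # signature 0x06054b50 as stored on disk
EOCD = struct.Struct("<4sHHHHIIH")     # 22 fixed bytes, little-endian

Eocd = namedtuple(
    "Eocd",
    "signature disk cd_disk entries_on_disk entries_total "
    "cd_size cd_offset comment_len",
)


def parse_eocd(data, offset=0):
    """Parse an end of central directory record starting at offset."""
    record = Eocd._make(EOCD.unpack_from(data, offset))
    if record.signature != EOCD_SIG:
        raise ValueError("no EOCDR signature at offset %d" % offset)
    return record
```

The cd_offset field is the one a ZIP tool follows to locate the central directory.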

Solution

The EOCDR begins with the hex bytes 50 4B 05 06 (the signature 0x06054b50 as stored on disk, in little-endian order). After opening the file in a hex editor, I searched for these bytes. And... there are multiple results, 21 of them to be specific?

50 4B 05 06 00 00 00 00 01 00 01 00 00 EE 00 00 CC 00 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 5A E4 00 00 88 0A 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 93 D7 00 00 65 17 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 CC CA 00 00 42 24 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 69 BF 00 00 BB 2F 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 CE B6 00 00 6C 38 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 29 A5 00 00 27 4A 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 E7 9C 00 00 7F 52 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 42 8B 00 00 3A 64 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 21 86 00 00 71 69 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 71 73 00 00 37 7C 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 66 70 00 00 58 7F 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 E3 59 00 00 F1 95 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 AC 52 00 00 3E 9D 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 A2 47 00 00 5E A8 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 8E 33 00 00 88 BC 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 9A 2A 00 00 92 C5 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 16 1C 00 00 2C D4 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 38 15 00 00 20 DB 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 2F 02 00 00 3F EE 00 00
50 4B 05 06 00 00 00 00 01 00 01 00 34 F0 00 00 50 00 00 00

When checking for hidden files in dump.zip, we found a total of 21 unique file names:

hello.txt
hi.txt
flag00 to flag18

OH! There is one EOCDR for each unique file name in the archive. And since flag00 to flag18 each have many files with the same name, the EOCDR should point out which one to extract, LETSGO!
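The count of EOCDR signatures can also be checked without a hex editor. A minimal sketch, assuming the file has been read into a bytes object; find_signature_offsets is a hypothetical helper name.

```python
EOCD_SIG = b"PK\x05\x06"  # hex bytes 50 4B 05 06


def find_signature_offsets(data, signature=EOCD_SIG):
    """Return the offset of every occurrence of signature in data."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets
```

For dump.zip, the length of the returned list should match the 21 results found in the hex editor.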

To understand the data in the records better, here is a representation with one of the records as an example:

50 4B 05 06 00 00 00 00 01 00 01 00 5A E4 00 00 88 0A 00 00

50 4B 05 06     end of central dir signature
00 00           number of this disk
00 00           number of the disk with the start of the central directory
01 00           total number of entries in the central directory on this disk
01 00           total number of entries in the central directory
5A E4 00 00     size of the central directory
88 0A 00 00     offset of start of central directory with respect to the starting disk number
(next 2 bytes)  .ZIP file comment length, not included in the 20 bytes shown above

The data significant for the solution is the offset of start of central directory with respect to the starting disk number. In short, this points to the start of the central directory.

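In the above example, the offset bytes are 88 0A 00 00. ZIP stores multi-byte values in little-endian order, so the address is 0x00000A88 (reading the same bytes as big-endian would give 0x880A0000, which is far beyond the end of the file). I entered the address 0xA88 in the hex editor and it points to a central directory header (50 4B 01 02).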
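To double-check the byte order, the four offset bytes from the example record can be decoded both ways in Python. This is only a small sanity check using int.from_bytes; the variable names are arbitrary.

```python
raw = bytes.fromhex("88 0A 00 00")  # offset field from the example record

as_little = int.from_bytes(raw, "little")
as_big = int.from_bytes(raw, "big")

print(hex(as_little), as_little)  # 0xa88 2696
print(hex(as_big), as_big)        # 0x880a0000 2282356736
```

Only the little-endian reading lands inside the file, whose last record binwalk reports at offset 61572.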

We can see the character C in the byte immediately before the central directory header. As the flag begins with CTF, I know that this is the correct character.

A similar process can be followed for the remaining records, and we get our flag! Of course, I wanted to make a script to make this process easier, so I did.

#!/usr/bin/env python3
import struct

EOCD_SIGNATURE = b"PK\x05\x06"  # hex bytes 50 4B 05 06

with open("dump.zip", "rb") as file:
    content = file.read()

flag = []
position = content.find(EOCD_SIGNATURE)
while position != -1:
    # offset of start of central directory: a 4-byte little-endian value
    # stored 16 bytes after the start of the EOCDR signature
    (address,) = struct.unpack_from("<I", content, position + 16)

    # retrieving the byte before the central directory header, i.e., the character
    flag.append(chr(content[address - 1]))

    position = content.find(EOCD_SIGNATURE, position + 1)

# the records for hi.txt and hello.txt contribute a newline each
print("".join(flag).strip("\n"))

$ python3 appnote-sol.py
CTF{p0s7m0d3rn_z1p}

Flag: CTF{p0s7m0d3rn_z1p}

Footnotes

¹ Florian Buchholz. The Structure of a PKZip File
² Hans Wennborg. Zip Files: History, Explanation and Implementation
³ Wikipedia. ZIP (File Format)