| source: http://www.pkware.com/documents/casestudies/APPNOTE.TXT␊ | 
| ␊ | 
| File:    APPNOTE.TXT - .ZIP File Format Specification␊ | 
| Version: 6.3.1 ␊ | 
| Revised: April 11, 2007␊ | 
| Copyright (c) 1989 - 2007 PKWARE Inc., All Rights Reserved.␊ | 
| ␊ | 
| The use of certain technological aspects disclosed in the current␊ | 
| APPNOTE is available pursuant to the below section entitled␊ | 
| "Incorporating PKWARE Proprietary Technology into Your Product".␊ | 
| ␊ | 
| I. Purpose␊ | 
| ----------␊ | 
| ␊ | 
| This specification is intended to define a cross-platform,␊ | 
| interoperable file storage and transfer format.  Since its ␊ | 
| first publication in 1989, PKWARE has remained committed to ␊ | 
| ensuring the interoperability of the .ZIP file format through ␊ | 
| publication and maintenance of this specification.  We trust that ␊ | 
| all .ZIP compatible vendors and application developers that have ␊ | 
| adopted and benefited from this format will share and support ␊ | 
| this commitment to interoperability.␊ | 
| ␊ | 
| II. Contacting PKWARE␊ | 
| ---------------------␊ | 
| ␊ | 
| PKWARE, Inc.␊ | 
| 648 N. Plankinton Avenue, Suite 220␊ | 
| Milwaukee, WI 53203␊ | 
| +1-414-289-9788␊ | 
| +1-414-289-9789 FAX␊ | 
| zipformat@pkware.com␊ | 
| ␊ | 
| III. Disclaimer␊ | 
| ---------------␊ | 
| ␊ | 
| Although PKWARE will attempt to supply current and accurate␊ | 
| information relating to its file formats, algorithms, and the␊ | 
| subject programs, the possibility of error or omission cannot ␊ | 
| be eliminated. PKWARE therefore expressly disclaims any warranty ␊ | 
| that the information contained in the associated materials relating ␊ | 
| to the subject programs and/or the format of the files created or␊ | 
| accessed by the subject programs and/or the algorithms used by␊ | 
| the subject programs, or any other matter, is current, correct or␊ | 
| accurate as delivered.  Any risk of damage due to any possible␊ | 
| inaccurate information is assumed by the user of the information.␊ | 
| Furthermore, the information relating to the subject programs␊ | 
| and/or the file formats created or accessed by the subject␊ | 
| programs and/or the algorithms used by the subject programs is␊ | 
| subject to change without notice.␊ | 
| ␊ | 
| If the version of this file is marked as a NOTIFICATION OF CHANGE,␊ | 
| the content defines an Early Feature Specification (EFS) change ␊ | 
| to the .ZIP file format that may be subject to modification prior ␊ | 
| to publication of the Final Feature Specification (FFS).  This␊ | 
| document may also contain information on Planned Feature ␊ | 
| Specifications (PFS) defining recognized future extensions.␊ | 
| ␊ | 
| IV. Change Log␊ | 
| --------------␊ | 
| ␊ | 
| Version       Change Description                        Date␊ | 
| -------       ------------------                       ----------␊ | 
| 5.2           -Single Password Symmetric Encryption    06/02/2003␊ | 
|                storage␊ | 
| ␊ | 
| 6.1.0         -Smartcard compatibility                 01/20/2004␊ | 
|               -Documentation on certificate storage␊ | 
| ␊ | 
| 6.2.0         -Introduction of Central Directory       04/26/2004␊ | 
|                Encryption for encrypting metadata␊ | 
|               -Added OS/X to Version Made By values␊ | 
| ␊ | 
| 6.2.1         -Added Extra Field placeholder for       04/01/2005␊ | 
|                POSZIP using ID 0x4690␊ | 
| ␊ | 
|               -Clarified size field on␊ | 
|                "zip64 end of central directory record"␊ | 
| ␊ | 
| 6.2.2         -Documented Final Feature Specification  01/06/2006␊ | 
|                for Strong Encryption␊ | 
| ␊ | 
|               -Clarifications and typographical␊ | 
|                corrections␊ | 
| ␊ | 
| 6.3.0         -Added tape positioning storage          09/29/2006␊ | 
|                parameters␊ | 
| ␊ | 
|               -Expanded list of supported hash algorithms␊ | 
| ␊ | 
|               -Expanded list of supported compression␊ | 
|                algorithms␊ | 
| ␊ | 
|               -Expanded list of supported encryption␊ | 
|                algorithms␊ | 
| ␊ | 
|               -Added option for Unicode filename␊ | 
|                storage␊ | 
| ␊ | 
|               -Clarifications for consistent use␊ | 
|                of Data Descriptor records␊ | 
| ␊ | 
|               -Added additional "Extra Field"␊ | 
|                definitions␊ | 
| ␊ | 
| 6.3.1         -Corrected standard hash values for      04/11/2007␊ | 
|                SHA-256/384/512␊ | 
| ␊ | 
| ␊ | 
| V. General Format of a .ZIP file␊ | 
| --------------------------------␊ | 
| ␊ | 
| Files are stored in arbitrary order.  Large .ZIP files can span multiple␊ | 
| volumes or be split into user-defined segment sizes. All values␊ | 
| are stored in little-endian byte order unless otherwise specified. ␊ | 
| ␊ | 
| Overall .ZIP file format:␊ | 
| ␊ | 
| [local file header 1]␊ | 
| [file data 1]␊ | 
| [data descriptor 1]␊ | 
| . ␊ | 
| .␊ | 
| .␊ | 
| [local file header n]␊ | 
| [file data n]␊ | 
| [data descriptor n]␊ | 
| [archive decryption header] ␊ | 
| [archive extra data record] ␊ | 
| [central directory]␊ | 
| [zip64 end of central directory record]␊ | 
| [zip64 end of central directory locator] ␊ | 
| [end of central directory record]␊ | 
| ␊ | 
| ␊ | 
| A.  Local file header:␊ | 
| ␊ | 
| local file header signature     4 bytes  (0x04034b50)␊ | 
| version needed to extract       2 bytes␊ | 
| general purpose bit flag        2 bytes␊ | 
| compression method              2 bytes␊ | 
| last mod file time              2 bytes␊ | 
| last mod file date              2 bytes␊ | 
| crc-32                          4 bytes␊ | 
| compressed size                 4 bytes␊ | 
| uncompressed size               4 bytes␊ | 
| file name length                2 bytes␊ | 
| extra field length              2 bytes␊ | 
| ␊ | 
| file name (variable size)␊ | 
| extra field (variable size)␊ | 
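
As an illustration only, the fixed portion of the local file header
laid out above can be read with Python's struct module.  The function
name parse_local_header and the dictionary keys are hypothetical, and
the sketch assumes the whole header is already in memory; it does not
handle ZIP64 sizes or masked local headers.

```python
import struct

# Fixed part of the local file header: 30 bytes, little-endian.
LOCAL_HEADER = struct.Struct("<I5H3I2H")
LOCAL_SIGNATURE = 0x04034B50


def parse_local_header(buf, offset=0):
    """Return (fields, data_offset) for the local header at offset."""
    (signature, version_needed, flags, method, mod_time, mod_date,
     crc32, compressed_size, uncompressed_size,
     name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if signature != LOCAL_SIGNATURE:
        raise ValueError("local file header signature not found")
    name_start = offset + LOCAL_HEADER.size
    extra_start = name_start + name_len
    data_offset = extra_start + extra_len  # file data follows the extra field
    fields = {
        "version_needed": version_needed,
        "flags": flags,
        "compression_method": method,
        "mod_time": mod_time,
        "mod_date": mod_date,
        "crc32": crc32,
        "compressed_size": compressed_size,
        "uncompressed_size": uncompressed_size,
        "file_name": bytes(buf[name_start:extra_start]),
        "extra_field": bytes(buf[extra_start:data_offset]),
    }
    return fields, data_offset
```

The returned data_offset is where the compressed or stored file data
begins, which is the only way to find it because the two trailing
fields are variable in size.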
| ␊ | 
| B.  File data␊ | 
| ␊ | 
| Immediately following the local header for a file␊ | 
| is the compressed or stored data for the file. ␊ | 
| The series of [local file header][file data][data␊ | 
| descriptor] repeats for each file in the .ZIP archive. ␊ | 
| ␊ | 
| C.  Data descriptor:␊ | 
| ␊ | 
| crc-32                          4 bytes␊ | 
| compressed size                 4 bytes␊ | 
| uncompressed size               4 bytes␊ | 
| ␊ | 
| This descriptor exists only if bit 3 of the general␊ | 
| purpose bit flag is set (see below).  It is byte aligned␊ | 
| and immediately follows the last byte of compressed data.␊ | 
| This descriptor is used only when it was not possible to␊ | 
| seek in the output .ZIP file, e.g., when the output .ZIP file␊ | 
| was standard output or a non-seekable device.  For ZIP64(tm) format␊ | 
| archives, the compressed and uncompressed sizes are 8 bytes each.␊ | 
| ␊ | 
| When compressing files, compressed and uncompressed sizes ␊ | 
| should be stored in ZIP64 format (as 8 byte values) when a ␊ | 
| file's size exceeds 0xFFFFFFFF.  However, ZIP64 format may be ␊ | 
| used regardless of the size of a file.  When extracting, if ␊ | 
| the zip64 extended information extra field is present for ␊ | 
| the file, the compressed and uncompressed sizes will be 8␊ | 
| byte values.␊ | 
| ␊ | 
| Although not originally assigned a signature, the value ␊ | 
| 0x08074b50 has commonly been adopted as a signature value ␊ | 
| for the data descriptor record.  Implementers should be ␊ | 
| aware that ZIP files may be encountered with or without this ␊ | 
| signature marking data descriptors and should account for␊ | 
| either case when reading ZIP files to ensure compatibility.␊ | 
| When writing ZIP files, it is recommended to include the␊ | 
| signature value marking the data descriptor record.  When␊ | 
| the signature is used, the fields currently defined for␊ | 
| the data descriptor record will immediately follow the␊ | 
| signature.␊ | 
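
A reader that accepts data descriptors with or without the signature
might look like the following Python sketch.  The function name and
the zip64 parameter are hypothetical.  The sketch assumes that a
leading value of 0x08074b50 is always the signature; a CRC-32 that
happens to equal that value would be misread, so a careful reader
would cross-check the result against the central directory.

```python
import struct

DESCRIPTOR_SIGNATURE = 0x08074B50


def parse_data_descriptor(buf, offset=0, zip64=False):
    """Return (crc32, compressed_size, uncompressed_size, bytes_consumed)."""
    start = offset
    (first,) = struct.unpack_from("<I", buf, offset)
    if first == DESCRIPTOR_SIGNATURE:
        offset += 4  # optional signature present; skip it
    # ZIP64 archives store both sizes as 8 byte values.
    fmt = "<IQQ" if zip64 else "<III"
    crc32, compressed_size, uncompressed_size = struct.unpack_from(fmt, buf, offset)
    offset += struct.calcsize(fmt)
    return crc32, compressed_size, uncompressed_size, offset - start
```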
| ␊ | 
| An extensible data descriptor will be released in a future␊ | 
| version of this APPNOTE.  This new record is intended to␊ | 
| resolve conflicts with the use of this record going forward,␊ | 
| and to provide better support for streamed file processing.␊ | 
| ␊ | 
| When the Central Directory Encryption method is used, the data␊ | 
| descriptor record is not required, but may be used.  If present,␊ | 
| and bit 3 of the general purpose bit field is set to indicate␊ | 
| its presence, the values in fields of the data descriptor␊ | 
| record should be set to binary zeros.␊ | 
| ␊ | 
| D.  Archive decryption header:  ␊ | 
| ␊ | 
| The Archive Decryption Header is introduced in version 6.2␊ | 
| of the ZIP format specification.  This record exists in support␊ | 
| of the Central Directory Encryption Feature implemented as part of ␊ | 
| the Strong Encryption Specification as described in this document.␊ | 
| When the Central Directory Structure is encrypted, this decryption␊ | 
| header will precede the encrypted data segment.  The encrypted␊ | 
| data segment will consist of the Archive extra data record (if␊ | 
| present) and the encrypted Central Directory Structure data.␊ | 
| The format of this data record is identical to the Decryption␊ | 
| header record preceding compressed file data.  If the central ␊ | 
| directory structure is encrypted, the location of the start of␊ | 
| this data record is determined using the Start of Central Directory␊ | 
| field in the Zip64 End of Central Directory record.  Refer to the ␊ | 
| section on the Strong Encryption Specification for information␊ | 
| on the fields used in the Archive Decryption Header record.␊ | 
| ␊ | 
| ␊ | 
| E.  Archive extra data record: ␊ | 
| ␊ | 
| archive extra data signature    4 bytes  (0x08064b50)␊ | 
| extra field length              4 bytes␊ | 
| extra field data                (variable size)␊ | 
| ␊ | 
| The Archive Extra Data Record is introduced in version 6.2␊ | 
| of the ZIP format specification.  This record exists in support␊ | 
| of the Central Directory Encryption Feature implemented as part of ␊ | 
| the Strong Encryption Specification as described in this document.␊ | 
| When present, this record immediately precedes the central ␊ | 
| directory data structure.  The size of this data record will be␊ | 
| included in the Size of the Central Directory field in the␊ | 
| End of Central Directory record.  If the central directory structure␊ | 
| is compressed, but not encrypted, the location of the start of␊ | 
| this data record is determined using the Start of Central Directory␊ | 
| field in the Zip64 End of Central Directory record.  ␊ | 
| ␊ | 
| ␊ | 
| F.  Central directory structure:␊ | 
| ␊ | 
| [file header 1]␊ | 
| .␊ | 
| .␊ | 
| . ␊ | 
| [file header n]␊ | 
| [digital signature] ␊ | 
| ␊ | 
| File header:␊ | 
| ␊ | 
| central file header signature   4 bytes  (0x02014b50)␊ | 
| version made by                 2 bytes␊ | 
| version needed to extract       2 bytes␊ | 
| general purpose bit flag        2 bytes␊ | 
| compression method              2 bytes␊ | 
| last mod file time              2 bytes␊ | 
| last mod file date              2 bytes␊ | 
| crc-32                          4 bytes␊ | 
| compressed size                 4 bytes␊ | 
| uncompressed size               4 bytes␊ | 
| file name length                2 bytes␊ | 
| extra field length              2 bytes␊ | 
| file comment length             2 bytes␊ | 
| disk number start               2 bytes␊ | 
| internal file attributes        2 bytes␊ | 
| external file attributes        4 bytes␊ | 
| relative offset of local header 4 bytes␊ | 
| ␊ | 
| file name (variable size)␊ | 
| extra field (variable size)␊ | 
| file comment (variable size)␊ | 
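
The file header layout above can be walked entry by entry, as in the
following Python sketch.  It is illustrative only: the function name
iter_central_headers and the dictionary keys are hypothetical, and the
sketch assumes an uncompressed, unencrypted central directory held in
memory, with the starting offset and entry count already known.

```python
import struct

CENTRAL_HEADER = struct.Struct("<I6H3I5H2I")  # 46 fixed bytes
CENTRAL_SIGNATURE = 0x02014B50


def iter_central_headers(buf, offset, count):
    """Yield one dict per central directory file header."""
    for _ in range(count):
        f = CENTRAL_HEADER.unpack_from(buf, offset)
        if f[0] != CENTRAL_SIGNATURE:
            raise ValueError("central file header signature not found")
        name_len, extra_len, comment_len = f[10], f[11], f[12]
        name_start = offset + CENTRAL_HEADER.size
        extra_start = name_start + name_len
        comment_start = extra_start + extra_len
        end = comment_start + comment_len
        yield {
            "version_made_by": f[1],
            "version_needed": f[2],
            "flags": f[3],
            "compression_method": f[4],
            "crc32": f[7],
            "compressed_size": f[8],
            "uncompressed_size": f[9],
            "disk_number_start": f[13],
            "local_header_offset": f[16],
            "file_name": bytes(buf[name_start:extra_start]),
            "extra_field": bytes(buf[extra_start:comment_start]),
            "file_comment": bytes(buf[comment_start:end]),
        }
        offset = end  # the next header starts right after the comment
```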
| ␊ | 
| Digital signature:␊ | 
| ␊ | 
| header signature                4 bytes  (0x05054b50)␊ | 
| size of data                    2 bytes␊ | 
| signature data (variable size)␊ | 
| ␊ | 
| With the introduction of the Central Directory Encryption ␊ | 
| feature in version 6.2 of this specification, the Central ␊ | 
| Directory Structure may be stored both compressed and encrypted. ␊ | 
| Although not required, it is assumed when encrypting the␊ | 
| Central Directory Structure, that it will be compressed␊ | 
| for greater storage efficiency.  Information on the␊ | 
| Central Directory Encryption feature can be found in the section␊ | 
| describing the Strong Encryption Specification. The Digital ␊ | 
| Signature record will be neither compressed nor encrypted.␊ | 
| ␊ | 
| G.  Zip64 end of central directory record␊ | 
| ␊ | 
| zip64 end of central dir ␊ | 
| signature                       4 bytes  (0x06064b50)␊ | 
| size of zip64 end of central␊ | 
| directory record                8 bytes␊ | 
| version made by                 2 bytes␊ | 
| version needed to extract       2 bytes␊ | 
| number of this disk             4 bytes␊ | 
| number of the disk with the ␊ | 
| start of the central directory  4 bytes␊ | 
| total number of entries in the␊ | 
| central directory on this disk  8 bytes␊ | 
| total number of entries in the␊ | 
| central directory               8 bytes␊ | 
| size of the central directory   8 bytes␊ | 
| offset of start of central␊ | 
| directory with respect to␊ | 
| the starting disk number        8 bytes␊ | 
| zip64 extensible data sector    (variable size)␊ | 
| ␊ | 
| The value stored into the "size of zip64 end of central␊ | 
| directory record" should be the size of the remaining␊ | 
| record and should not include the leading 12 bytes.␊ | 
| ␊ | 
| Size = SizeOfFixedFields + SizeOfVariableData - 12.␊ | 
| ␊ | 
| The above record structure defines Version 1 of the ␊ | 
| zip64 end of central directory record. Version 1 was ␊ | 
| implemented in versions of this specification preceding ␊ | 
| 6.2 in support of the ZIP64 large file feature. The ␊ | 
| introduction of the Central Directory Encryption feature ␊ | 
| implemented in version 6.2 as part of the Strong Encryption ␊ | 
| Specification defines Version 2 of this record structure. ␊ | 
| Refer to the section describing the Strong Encryption ␊ | 
| Specification for details on the version 2 format for ␊ | 
| this record.␊ | 
| ␊ | 
| Special purpose data may reside in the zip64 extensible data␊ | 
| sector field following either a V1 or V2 version of this␊ | 
| record.  To ensure identification of this special purpose data␊ | 
| it must include an identifying header block consisting of the␊ | 
| following:␊ | 
| ␊ | 
| Header ID  -  2 bytes␊ | 
| Data Size  -  4 bytes␊ | 
| ␊ | 
| The Header ID field indicates the type of data that is in the ␊ | 
| data block that follows.␊ | 
| ␊ | 
| Data Size identifies the number of bytes that follow for this␊ | 
| data block type.␊ | 
| ␊ | 
| Multiple special purpose data blocks may be present, but each␊ | 
| must be preceded by a Header ID and Data Size field.  Current␊ | 
| mappings of Header ID values supported in this field are as␊ | 
| defined in APPENDIX C.␊ | 
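
The size rule and the block structure described above can be sketched
in Python as follows.  The function names are hypothetical.  The value
56 is the sum of the fixed field sizes listed for the Version 1 record
(4+8+2+2+4+4+8+8+8+8); the Version 2 record is not covered.

```python
import struct

ZIP64_EOCD_FIXED_SIZE = 56  # signature through offset field, Version 1


def zip64_eocd_size_field(variable_data_len):
    """Value for 'size of zip64 end of central directory record'."""
    # Excludes the leading 12 bytes (signature + the size field itself).
    return ZIP64_EOCD_FIXED_SIZE + variable_data_len - 12


def iter_zip64_extensible_blocks(sector):
    """Yield (header_id, data) for each block in the extensible data sector."""
    offset = 0
    while offset < len(sector):
        if offset + 6 > len(sector):
            raise ValueError("truncated block header")
        header_id, data_size = struct.unpack_from("<HI", sector, offset)
        offset += 6  # 2 byte Header ID + 4 byte Data Size
        if offset + data_size > len(sector):
            raise ValueError("block data runs past end of sector")
        yield header_id, bytes(sector[offset:offset + data_size])
        offset += data_size
```

Note that the Data Size here is 4 bytes, unlike the 2 byte Data Size
used in the per-file extra field.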
| ␊ | 
| H.  Zip64 end of central directory locator␊ | 
| ␊ | 
| zip64 end of central dir locator ␊ | 
| signature                       4 bytes  (0x07064b50)␊ | 
| number of the disk with the␊ | 
| start of the zip64 end of ␊ | 
| central directory               4 bytes␊ | 
| relative offset of the zip64␊ | 
| end of central directory record 8 bytes␊ | 
| total number of disks           4 bytes␊ | 
| ␊ | 
| I.  End of central directory record:␊ | 
| ␊ | 
| end of central dir signature    4 bytes  (0x06054b50)␊ | 
| number of this disk             2 bytes␊ | 
| number of the disk with the␊ | 
| start of the central directory  2 bytes␊ | 
| total number of entries in the␊ | 
| central directory on this disk  2 bytes␊ | 
| total number of entries in␊ | 
| the central directory           2 bytes␊ | 
| size of the central directory   4 bytes␊ | 
| offset of start of central␊ | 
| directory with respect to␊ | 
| the starting disk number        4 bytes␊ | 
| .ZIP file comment length        2 bytes␊ | 
| .ZIP file comment       (variable size)␊ | 
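
This document does not prescribe how a reader locates the end of
central directory record.  A common approach, shown here purely as an
assumption, is to search backward from the end of the file for the
signature and accept the first candidate whose comment length reaches
exactly to the end of the data.  The function name and dictionary keys
in this Python sketch are hypothetical, and ZIP64 and multi-disk
archives are not handled.

```python
import struct

EOCD = struct.Struct("<I4H2IH")  # 22 fixed bytes
EOCD_SIGNATURE = struct.pack("<I", 0x06054B50)


def find_end_of_central_directory(buf):
    """Return (position, fields) of the last plausible record in buf."""
    pos = buf.rfind(EOCD_SIGNATURE)
    while pos != -1:
        if pos + EOCD.size <= len(buf):
            (_, disk, cd_disk, entries_here, entries_total,
             cd_size, cd_offset, comment_len) = EOCD.unpack_from(buf, pos)
            # Accept the candidate only if its comment ends the buffer.
            if pos + EOCD.size + comment_len == len(buf):
                return pos, {
                    "disk": disk,
                    "cd_disk": cd_disk,
                    "entries_on_disk": entries_here,
                    "entries_total": entries_total,
                    "cd_size": cd_size,
                    "cd_offset": cd_offset,
                    "comment": bytes(buf[pos + EOCD.size:]),
                }
        # Otherwise keep looking for an earlier occurrence.
        pos = buf.rfind(EOCD_SIGNATURE, 0, pos + 3)
    raise ValueError("end of central directory record not found")
```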
| ␊ | 
| J.  Explanation of fields:␊ | 
| ␊ | 
| version made by (2 bytes)␊ | 
| ␊ | 
| The upper byte indicates the compatibility of the file␊ | 
| attribute information.  If the external file attributes ␊ | 
| are compatible with MS-DOS and can be read by PKZIP for ␊ | 
| DOS version 2.04g then this value will be zero.  If these ␊ | 
| attributes are not compatible, then this value will ␊ | 
| identify the host system on which the attributes are ␊ | 
| compatible.  Software can use this information to determine␊ | 
| the line record format for text files etc.  The current␊ | 
| mappings are:␊ | 
| ␊ | 
| 0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)␊ | 
| 1 - Amiga                     2 - OpenVMS␊ | 
| 3 - UNIX                      4 - VM/CMS␊ | 
| 5 - Atari ST                  6 - OS/2 H.P.F.S.␊ | 
| 7 - Macintosh                 8 - Z-System␊ | 
| 9 - CP/M                     10 - Windows NTFS␊ | 
| 11 - MVS (OS/390 - Z/OS)      12 - VSE␊ | 
| 13 - Acorn Risc               14 - VFAT␊ | 
| 15 - alternate MVS            16 - BeOS␊ | 
| 17 - Tandem                   18 - OS/400␊ | 
| 19 - OS/X (Darwin)            20 thru 255 - unused␊ | 
| ␊ | 
| The lower byte indicates the ZIP specification version ␊ | 
| (the version of this document) supported by the software ␊ | 
| used to encode the file.  The value/10 indicates the major ␊ | 
| version number, and the value mod 10 is the minor version ␊ | 
| number.  ␊ | 
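
The split of this field into host system and specification version can
be illustrated with a short Python sketch.  The function name is
hypothetical; the host table is copied from the mappings above.

```python
HOST_SYSTEMS = {
    0: "MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)",
    1: "Amiga", 2: "OpenVMS", 3: "UNIX", 4: "VM/CMS", 5: "Atari ST",
    6: "OS/2 H.P.F.S.", 7: "Macintosh", 8: "Z-System", 9: "CP/M",
    10: "Windows NTFS", 11: "MVS (OS/390 - Z/OS)", 12: "VSE",
    13: "Acorn Risc", 14: "VFAT", 15: "alternate MVS", 16: "BeOS",
    17: "Tandem", 18: "OS/400", 19: "OS/X (Darwin)",
}


def decode_version_made_by(value):
    """Return (host_name, major, minor) for a 2 byte 'version made by'."""
    host = (value >> 8) & 0xFF   # upper byte: attribute compatibility
    spec = value & 0xFF          # lower byte: specification version
    return HOST_SYSTEMS.get(host, "unused"), spec // 10, spec % 10
```

For example, a stored value of 0x0314 decodes to UNIX attributes and
specification version 2.0.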
| ␊ | 
| version needed to extract (2 bytes)␊ | 
| ␊ | 
| The minimum supported ZIP specification version needed to ␊ | 
| extract the file, mapped as above.  This value is based on ␊ | 
| the specific format features a ZIP program must support to ␊ | 
| be able to extract the file.  If multiple features are␊ | 
| applied to a file, the minimum version should be set to the ␊ | 
| feature having the highest value. New features or feature ␊ | 
| changes affecting the published format specification will be ␊ | 
| implemented using higher version numbers than the last ␊ | 
| published value to avoid conflict.␊ | 
| ␊ | 
| Current minimum feature versions are as defined below:␊ | 
| ␊ | 
| 1.0 - Default value␊ | 
| 1.1 - File is a volume label␊ | 
| 2.0 - File is a folder (directory)␊ | 
| 2.0 - File is compressed using Deflate compression␊ | 
| 2.0 - File is encrypted using traditional PKWARE encryption␊ | 
| 2.1 - File is compressed using Deflate64(tm)␊ | 
| 2.5 - File is compressed using PKWARE DCL Implode ␊ | 
| 2.7 - File is a patch data set ␊ | 
| 4.5 - File uses ZIP64 format extensions␊ | 
| 4.6 - File is compressed using BZIP2 compression*␊ | 
| 5.0 - File is encrypted using DES␊ | 
| 5.0 - File is encrypted using 3DES␊ | 
| 5.0 - File is encrypted using original RC2 encryption␊ | 
| 5.0 - File is encrypted using RC4 encryption␊ | 
| 5.1 - File is encrypted using AES encryption␊ | 
| 5.1 - File is encrypted using corrected RC2 encryption**␊ | 
| 5.2 - File is encrypted using corrected RC2-64 encryption**␊ | 
| 6.1 - File is encrypted using non-OAEP key wrapping***␊ | 
| 6.2 - Central directory encryption␊ | 
| 6.3 - File is compressed using LZMA␊ | 
| 6.3 - File is compressed using PPMd+␊ | 
| 6.3 - File is encrypted using Blowfish␊ | 
| 6.3 - File is encrypted using Twofish␊ | 
| ␊ | 
| ␊ | 
| * Early 7.x (pre-7.2) versions of PKZIP incorrectly set the␊ | 
| version needed to extract for BZIP2 compression to be 50␊ | 
| when it should have been 46.␊ | 
| ␊ | 
| ** Refer to the section on Strong Encryption Specification␊ | 
| for additional information regarding RC2 corrections.␊ | 
| ␊ | 
| *** Certificate encryption using non-OAEP key wrapping is the␊ | 
| intended mode of operation for all versions beginning with 6.1.␊ | 
| Support for OAEP key wrapping should only be used for␊ | 
| backward compatibility when sending ZIP files to be opened by␊ | 
| versions of PKZIP older than 6.1 (5.0 or 6.0).␊ | 
| ␊ | 
| + Files compressed using PPMd should set the version␊ | 
| needed to extract field to 6.3; however, not all ZIP ␊ | 
| programs enforce this, and some may be unable to decompress ␊ | 
| data files compressed using PPMd if this value is set.␊ | 
| ␊ | 
| When using ZIP64 extensions, the corresponding value in the␊ | 
| zip64 end of central directory record should also be set.  ␊ | 
| This field should be set appropriately to indicate whether ␊ | 
| Version 1 or Version 2 format is in use. ␊ | 
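
The rule that the field takes the highest value among the features
applied to a file can be sketched in Python as follows.  The feature
keys and the function name are hypothetical labels for the table
above; each version is stored as major*10 + minor, so 4.5 becomes 45.

```python
FEATURE_VERSIONS = {
    "default": 10, "volume_label": 11, "directory": 20, "deflate": 20,
    "traditional_encryption": 20, "deflate64": 21, "dcl_implode": 25,
    "patch_data": 27, "zip64": 45, "bzip2": 46, "des": 50, "3des": 50,
    "rc2_original": 50, "rc4": 50, "aes": 51, "rc2_corrected": 51,
    "rc2_64_corrected": 52, "non_oaep_key_wrapping": 61,
    "central_directory_encryption": 62, "lzma": 63, "ppmd": 63,
    "blowfish": 63, "twofish": 63,
}


def version_needed_to_extract(features):
    """Return the stored field value for an iterable of feature keys."""
    versions = [FEATURE_VERSIONS[name] for name in features]
    # With no special features the default value 1.0 applies.
    return max([FEATURE_VERSIONS["default"]] + versions)
```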
| ␊ | 
| general purpose bit flag: (2 bytes)␊ | 
| ␊ | 
| Bit 0: If set, indicates that the file is encrypted.␊ | 
| ␊ | 
| (For Method 6 - Imploding)␊ | 
| Bit 1: If the compression method used was type 6,␊ | 
| Imploding, then this bit, if set, indicates␊ | 
| an 8K sliding dictionary was used.  If clear,␊ | 
| then a 4K sliding dictionary was used.␊ | 
| Bit 2: If the compression method used was type 6,␊ | 
| Imploding, then this bit, if set, indicates␊ | 
| 3 Shannon-Fano trees were used to encode the␊ | 
| sliding dictionary output.  If clear, then 2␊ | 
| Shannon-Fano trees were used.␊ | 
| ␊ | 
| (For Methods 8 and 9 - Deflating)␊ | 
| Bit 2  Bit 1␊ | 
| 0      0    Normal (-en) compression option was used.␊ | 
| 0      1    Maximum (-exx/-ex) compression option was used.␊ | 
| 1      0    Fast (-ef) compression option was used.␊ | 
| 1      1    Super Fast (-es) compression option was used.␊ | 
| ␊ | 
| (For Method 14 - LZMA)␊ | 
| Bit 1: If the compression method used was type 14,␊ | 
| LZMA, then this bit, if set, indicates␊ | 
| an end-of-stream (EOS) marker is used to␊ | 
| mark the end of the compressed data stream.␊ | 
| If clear, then an EOS marker is not present␊ | 
| and the compressed data size must be known␊ | 
| to extract.␊ | 
| ␊ | 
| Note:  Bits 1 and 2 are undefined if the compression␊ | 
| method is any other.␊ | 
| ␊ | 
| Bit 3: If this bit is set, the fields crc-32, compressed ␊ | 
| size and uncompressed size are set to zero in the ␊ | 
| local header.  The correct values are put in the ␊ | 
| data descriptor immediately following the compressed␊ | 
| data.  (Note: PKZIP version 2.04g for DOS only ␊ | 
| recognizes this bit for method 8 compression, newer ␊ | 
| versions of PKZIP recognize this bit for any ␊ | 
| compression method.)␊ | 
| ␊ | 
| Bit 4: Reserved for use with method 8, for enhanced␊ | 
| deflating. ␊ | 
| ␊ | 
| Bit 5: If this bit is set, this indicates that the file is ␊ | 
| compressed patched data.  (Note: Requires PKZIP ␊ | 
| version 2.70 or greater)␊ | 
| ␊ | 
| Bit 6: Strong encryption.  If this bit is set, you should␊ | 
| set the version needed to extract value to at least␊ | 
| 50 and you must also set bit 0.  If AES encryption␊ | 
| is used, the version needed to extract value must ␊ | 
| be at least 51.␊ | 
| ␊ | 
| Bit 7: Currently unused.␊ | 
| ␊ | 
| Bit 8: Currently unused.␊ | 
| ␊ | 
| Bit 9: Currently unused.␊ | 
| ␊ | 
| Bit 10: Currently unused.␊ | 
| ␊ | 
| Bit 11: Language encoding flag (EFS).  If this bit is set,␊ | 
| the filename and comment fields for this file␊ | 
| must be encoded using UTF-8. (see APPENDIX D)␊ | 
| ␊ | 
| Bit 12: Reserved by PKWARE for enhanced compression.␊ | 
| ␊ | 
| Bit 13: Used when encrypting the Central Directory to indicate ␊ | 
| selected data values in the Local Header are masked to␊ | 
| hide their actual values.  See the section describing ␊ | 
| the Strong Encryption Specification for details.␊ | 
| ␊ | 
| Bit 14: Reserved by PKWARE.␊ | 
| ␊ | 
| Bit 15: Reserved by PKWARE.␊ | 
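
Because the meaning of bits 1 and 2 depends on the compression method,
a decoder needs both values.  The following Python sketch is one way
to express the bit definitions above; the function name and dictionary
keys are hypothetical, and unused or reserved bits are ignored.

```python
DEFLATE_OPTIONS = {
    # (bit 2, bit 1) -> compression option
    (0, 0): "normal",
    (0, 1): "maximum",
    (1, 0): "fast",
    (1, 1): "super fast",
}


def decode_general_purpose_flags(flags, method):
    """Return a dict describing the general purpose bit flag."""
    bit1 = (flags >> 1) & 1
    bit2 = (flags >> 2) & 1
    info = {
        "encrypted": bool(flags & 0x0001),            # bit 0
        "has_data_descriptor": bool(flags & 0x0008),  # bit 3
        "patched_data": bool(flags & 0x0020),         # bit 5
        "strong_encryption": bool(flags & 0x0040),    # bit 6
        "utf8_names": bool(flags & 0x0800),           # bit 11
        "local_header_masked": bool(flags & 0x2000),  # bit 13
    }
    if method == 6:  # Imploding
        info["sliding_dictionary"] = "8K" if bit1 else "4K"
        info["shannon_fano_trees"] = 3 if bit2 else 2
    elif method in (8, 9):  # Deflating
        info["deflate_option"] = DEFLATE_OPTIONS[(bit2, bit1)]
    elif method == 14:  # LZMA
        info["lzma_eos_marker"] = bool(bit1)
    return info
```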
| ␊ | 
| compression method: (2 bytes)␊ | 
| ␊ | 
| (see accompanying documentation for algorithm␊ | 
| descriptions)␊ | 
| ␊ | 
| 0 - The file is stored (no compression)␊ | 
| 1 - The file is Shrunk␊ | 
| 2 - The file is Reduced with compression factor 1␊ | 
| 3 - The file is Reduced with compression factor 2␊ | 
| 4 - The file is Reduced with compression factor 3␊ | 
| 5 - The file is Reduced with compression factor 4␊ | 
| 6 - The file is Imploded␊ | 
| 7 - Reserved for Tokenizing compression algorithm␊ | 
| 8 - The file is Deflated␊ | 
| 9 - Enhanced Deflating using Deflate64(tm)␊ | 
| 10 - PKWARE Data Compression Library Imploding (old IBM TERSE)␊ | 
| 11 - Reserved by PKWARE␊ | 
| 12 - File is compressed using BZIP2 algorithm␊ | 
| 13 - Reserved by PKWARE␊ | 
| 14 - LZMA (EFS)␊ | 
| 15 - Reserved by PKWARE␊ | 
| 16 - Reserved by PKWARE␊ | 
| 17 - Reserved by PKWARE␊ | 
| 18 - File is compressed using IBM TERSE (new)␊ | 
| 19 - IBM LZ77 z Architecture (PFS)␊ | 
| 98 - PPMd version I, Rev 1␊ | 
| ␊ | 
| date and time fields: (2 bytes each)␊ | 
| ␊ | 
| The date and time are encoded in standard MS-DOS format.␊ | 
| If input came from standard input, the date and time are␊ | 
| those at which compression was started for this data. ␊ | 
| If encrypting the central directory and general purpose bit ␊ | 
| flag 13 is set indicating masking, the value stored in the ␊ | 
| Local Header will be zero. ␊ | 
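
This document does not spell out the MS-DOS encoding.  The Python
sketch below assumes the usual MS-DOS packing: the date holds the year
offset from 1980 in bits 9-15, the month in bits 5-8, and the day in
bits 0-4; the time holds hours in bits 11-15, minutes in bits 5-10,
and seconds divided by two in bits 0-4.  The function names are
hypothetical.

```python
from datetime import datetime


def dos_to_datetime(dos_date, dos_time):
    """Decode the 2 byte date and 2 byte time fields."""
    year = ((dos_date >> 9) & 0x7F) + 1980
    month = (dos_date >> 5) & 0x0F
    day = dos_date & 0x1F
    hour = (dos_time >> 11) & 0x1F
    minute = (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2  # stored with 2 second resolution
    return datetime(year, month, day, hour, minute, second)


def datetime_to_dos(dt):
    """Encode a datetime as (dos_date, dos_time); odd seconds round down."""
    dos_date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    dos_time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return dos_date, dos_time
```

A masked value of zero is not a valid calendar date, so
dos_to_datetime raises ValueError for it; callers would need to treat
that case separately.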
| ␊ | 
| CRC-32: (4 bytes)␊ | 
| ␊ | 
| The CRC-32 algorithm was generously contributed by␊ | 
| David Schwaderer and can be found in his excellent␊ | 
| book "C Programmer's Guide to NetBIOS" published by␊ | 
| Howard W. Sams & Co. Inc.  The 'magic number' for␊ | 
| the CRC is 0xdebb20e3.  The proper CRC pre and post␊ | 
| conditioning is used, meaning that the CRC register␊ | 
| is pre-conditioned with all ones (a starting value␊ | 
| of 0xffffffff) and the value is post-conditioned by␊ | 
| taking the one's complement of the CRC residual.␊ | 
| If bit 3 of the general purpose flag is set, this␊ | 
| field is set to zero in the local header and the correct␊ | 
| value is put in the data descriptor and in the central␊ | 
| directory. When encrypting the central directory, if the␊ | 
| local header is not in ZIP64 format and general purpose ␊ | 
| bit flag 13 is set indicating masking, the value stored ␊ | 
| in the Local Header will be zero. ␊ | 
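
The pre- and post-conditioning described above can be illustrated with
a bit-at-a-time Python sketch.  The reflected polynomial constant
0xEDB88320 is not stated in this document; it is assumed here as the
conventional CRC-32 polynomial, which yields the same values as
zlib.crc32.  The function name is hypothetical, and a table-driven
routine would be used where speed matters.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # assumption: conventional CRC-32 polynomial


def crc32_bitwise(data):
    """Compute the CRC-32 of a bytes object, one bit at a time."""
    crc = 0xFFFFFFFF  # pre-condition the register with all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32_POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF  # post-condition: one's complement


if __name__ == "__main__":
    message = b"hello"
    value = crc32_bitwise(message)
    print(hex(value), value == zlib.crc32(message))
    # Appending the CRC (little-endian) and undoing the final complement
    # leaves the 'magic number' residual in the register.
    residual = crc32_bitwise(message + value.to_bytes(4, "little")) ^ 0xFFFFFFFF
    print(hex(residual))
```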
| ␊ | 
| compressed size: (4 bytes)␊ | 
| uncompressed size: (4 bytes)␊ | 
| ␊ | 
| The size of the file compressed and uncompressed,␊ | 
| respectively.  When a decryption header is present it will␊ | 
| be placed in front of the file data and the value of the␊ | 
| compressed file size will include the bytes of the decryption␊ | 
| header.  If bit 3 of the general purpose bit flag is set, ␊ | 
| these fields are set to zero in the local header and the ␊ | 
| correct values are put in the data descriptor and␊ | 
| in the central directory.  If an archive is in ZIP64 format␊ | 
| and the value in this field is 0xFFFFFFFF, the size will be␊ | 
| in the corresponding 8 byte ZIP64 extended information ␊ | 
| extra field.  When encrypting the central directory, if the␊ | 
| local header is not in ZIP64 format and general purpose bit ␊ | 
| flag 13 is set indicating masking, the value stored for the ␊ | 
| uncompressed size in the Local Header will be zero. ␊ | 
| ␊ | 
| file name length: (2 bytes)␊ | 
| extra field length: (2 bytes)␊ | 
| file comment length: (2 bytes)␊ | 
| ␊ | 
| The length of the file name, extra field, and comment␊ | 
| fields respectively.  The combined length of any␊ | 
| directory record and these three fields should not␊ | 
| generally exceed 65,535 bytes.  If input came from standard␊ | 
| input, the file name length is set to zero.  ␊ | 
| ␊ | 
| disk number start: (2 bytes)␊ | 
| ␊ | 
| The number of the disk on which this file begins.  If an ␊ | 
| archive is in ZIP64 format and the value in this field is ␊ | 
| 0xFFFF, the disk number will be in the corresponding 4 byte ␊ | 
| zip64 extended information extra field.␊ | 
| ␊ | 
| internal file attributes: (2 bytes)␊ | 
| ␊ | 
| Bits 1 and 2 are reserved for use by PKWARE.␊ | 
| ␊ | 
| The lowest bit of this field indicates, if set, that␊ | 
| the file is apparently an ASCII or text file.  If not␊ | 
| set, the file apparently contains binary data.␊ | 
| The remaining bits are unused in version 1.0.␊ | 
| ␊ | 
| The 0x0002 bit of this field indicates, if set, that a ␊ | 
| 4 byte variable record length control field precedes each ␊ | 
| logical record indicating the length of the record. The ␊ | 
| record length control field is stored in little-endian byte␊ | 
| order.  This flag is independent of text control characters, ␊ | 
| and if used in conjunction with text data, includes any ␊ | 
| control characters in the total length of the record. This ␊ | 
| value is provided for mainframe data transfer support.␊ | 
| ␊ | 
| external file attributes: (4 bytes)␊ | 
| ␊ | 
| The mapping of the external attributes is␊ | 
| host-system dependent (see 'version made by').  For␊ | 
| MS-DOS, the low order byte is the MS-DOS directory␊ | 
| attribute byte.  If input came from standard input, this␊ | 
| field is set to zero.␊ | 
| ␊ | 
| relative offset of local header: (4 bytes)␊ | 
| ␊ | 
| This is the offset from the start of the first disk on␊ | 
| which this file appears, to where the local header should␊ | 
| be found.  If an archive is in ZIP64 format and the value␊ | 
| in this field is 0xFFFFFFFF, the offset will be in the ␊ | 
| corresponding 8 byte zip64 extended information extra field.␊ | 
| ␊ | 
| file name: (Variable)␊ | 
| ␊ | 
| The name of the file, with optional relative path.␊ | 
| The path stored should not contain a drive or␊ | 
| device letter, or a leading slash.  All slashes␊ | 
| should be forward slashes '/' as opposed to␊ | 
| backwards slashes '\' for compatibility with Amiga␊ | 
| and UNIX file systems etc.  If input came from standard␊ | 
| input, there is no file name field.  If encrypting␊ | 
| the central directory and general purpose bit flag 13 is set ␊ | 
| indicating masking, the file name stored in the Local Header ␊ | 
| will not be the actual file name.  A masking value consisting ␊ | 
| of a unique hexadecimal value will be stored.  This value will ␊ | 
| be sequentially incremented for each file in the archive. See␊ | 
| the section on the Strong Encryption Specification for details ␊ | 
| on retrieving the encrypted file name. ␊ | 
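
The path rules above (no drive or device letter, no leading slash,
forward slashes only) can be sketched in Python as follows.  The
function name is hypothetical, and the drive test assumes a single
letter followed by a colon, as on MS-DOS; other device syntaxes are
not handled.

```python
def to_zip_file_name(path):
    """Convert a host path into the form stored in the file name field."""
    name = path.replace("\\", "/")  # forward slashes only
    # Drop a leading drive or device letter such as "C:".
    if len(name) >= 2 and name[0].isalpha() and name[1] == ":":
        name = name[2:]
    return name.lstrip("/")  # no leading slash
```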
| ␊ | 
| extra field: (Variable)␊ | 
| ␊ | 
| This is for expansion.  If additional information␊ | 
| needs to be stored for special needs or for specific ␊ | 
| platforms, it should be stored here.  Earlier versions ␊ | 
| of the software can then safely skip this field, and ␊ | 
| find the next file or header.  This field will be 0 ␊ | 
| length in version 1.0.␊ | 
| ␊ | 
| In order to allow different programs and different types␊ | 
| of information to be stored in the 'extra' field in .ZIP␊ | 
| files, the following structure should be used for all␊ | 
| programs storing data in this field:␊ | 
| ␊ | 
| header1+data1 + header2+data2 . . .␊ | 
| ␊ | 
| Each header should consist of:␊ | 
| ␊ | 
| Header ID - 2 bytes␊ | 
| Data Size - 2 bytes␊ | 
| ␊ | 
| Note: all fields are stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| The Header ID field indicates the type of data that is in␊ | 
| the following data block.␊ | 
| ␊ | 
| Header ID's of 0 thru 31 are reserved for use by PKWARE.␊ | 
| The remaining ID's can be used by third party vendors for␊ | 
| proprietary usage.␊ | 
| ␊ | 
| The current Header ID mappings defined by PKWARE are:␊ | 
| ␊ | 
| 0x0001        Zip64 extended information extra field␊ | 
| 0x0007        AV Info␊ | 
| 0x0008        Reserved for extended language encoding data (PFS)␊ | 
| (see APPENDIX D)␊ | 
| 0x0009        OS/2␊ | 
| 0x000a        NTFS ␊ | 
| 0x000c        OpenVMS␊ | 
| 0x000d        UNIX␊ | 
| 0x000e        Reserved for file stream and fork descriptors␊ | 
| 0x000f        Patch Descriptor␊ | 
| 0x0014        PKCS#7 Store for X.509 Certificates␊ | 
| 0x0015        X.509 Certificate ID and Signature for ␊ | 
| individual file␊ | 
| 0x0016        X.509 Certificate ID for Central Directory␊ | 
| 0x0017        Strong Encryption Header␊ | 
| 0x0018        Record Management Controls␊ | 
| 0x0019        PKCS#7 Encryption Recipient Certificate List␊ | 
| 0x0065        IBM S/390 (Z390), AS/400 (I400) attributes ␊ | 
| - uncompressed␊ | 
| 0x0066        Reserved for IBM S/390 (Z390), AS/400 (I400) ␊ | 
| attributes - compressed␊ | 
| 0x4690        POSZIP 4690 (reserved) ␊ | 
| ␊ | 
| Third party mappings commonly used are:␊ | 
| ␊ | 
| ␊ | 
| 0x07c8        Macintosh␊ | 
| 0x2605        ZipIt Macintosh␊ | 
| 0x2705        ZipIt Macintosh 1.3.5+␊ | 
| 0x2805        ZipIt Macintosh 1.3.5+␊ | 
| 0x334d        Info-ZIP Macintosh␊ | 
| 0x4341        Acorn/SparkFS ␊ | 
| 0x4453        Windows NT security descriptor (binary ACL)␊ | 
| 0x4704        VM/CMS␊ | 
| 0x470f        MVS␊ | 
| 0x4b46        FWKCS MD5 (see below)␊ | 
| 0x4c41        OS/2 access control list (text ACL)␊ | 
| 0x4d49        Info-ZIP OpenVMS␊ | 
| 0x4f4c        Xceed original location extra field␊ | 
| 0x5356        AOS/VS (ACL)␊ | 
| 0x5455        extended timestamp␊ | 
| 0x554e        Xceed unicode extra field␊ | 
| 0x5855        Info-ZIP UNIX (original, also OS/2, NT, etc)␊ | 
| 0x6542        BeOS/BeBox␊ | 
| 0x756e        ASi UNIX␊ | 
| 0x7855        Info-ZIP UNIX (new)␊ | 
| 0xa220        Microsoft Open Packaging Growth Hint␊ | 
| 0xfd4a        SMS/QDOS␊ | 
| ␊ | 
| Detailed descriptions of Extra Fields defined by third ␊ | 
| party mappings will be documented as information on␊ | 
| these data structures is made available to PKWARE.  ␊ | 
| PKWARE does not guarantee the accuracy of any published␊ | 
| third party data.␊ | 
| ␊ | 
| The Data Size field indicates the size of the following␊ | 
| data block. Programs can use this value to skip to the␊ | 
| next header block, passing over any data blocks that are␊ | 
| not of interest.␊ | 
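The header1+data1 layout and the use of Data Size to skip blocks can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name `iter_extra_blocks` and the sample bytes are hypothetical.

```python
import struct

def iter_extra_blocks(extra: bytes):
    """Yield (header_id, data) for each block in a ZIP 'extra' field."""
    pos = 0
    while pos + 4 <= len(extra):
        # Header ID and Data Size are 2-byte little-endian values.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + size > len(extra):
            raise ValueError("extra field block overruns the field")
        yield header_id, extra[pos:pos + size]
        pos += size  # Data Size lets us step straight to the next header

# Hypothetical extra field: a 1-byte 0x5455 block, then an 8-byte 0x0001 block.
sample = (struct.pack("<HH", 0x5455, 1) + b"\x00"
          + struct.pack("<HH", 0x0001, 8) + bytes(8))

# Keep only the Zip64 block and pass over everything else.
zip64_blocks = [data for hid, data in iter_extra_blocks(sample) if hid == 0x0001]
```

In this sketch, fewer than 4 trailing bytes are silently ignored; a stricter reader might reject them instead.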
| ␊ | 
| Note: As stated above, the size of the entire .ZIP file␊ | 
| header, including the file name, comment, and extra␊ | 
| field should not exceed 64K in size.␊ | 
| ␊ | 
| In case two different programs should appropriate the same␊ | 
| Header ID value, it is strongly recommended that each␊ | 
| program place a unique signature of at least two bytes in␊ | 
| size (and preferably 4 bytes or bigger) at the start of␊ | 
| each data area.  Every program should verify that its␊ | 
| unique signature is present, in addition to the Header ID␊ | 
| value being correct, before assuming that it is a block of␊ | 
| known type.␊ | 
| ␊ | 
| -Zip64 Extended Information Extra Field (0x0001):␊ | 
| ␊ | 
| The following is the layout of the zip64 extended ␊ | 
| information "extra" block. If one of the size or␊ | 
| offset fields in the Local or Central directory␊ | 
| record is too small to hold the required data,␊ | 
| a Zip64 extended information record is created.␊ | 
| The order of the fields in the zip64 extended ␊ | 
| information record is fixed, but the fields will␊ | 
| only appear if the corresponding Local or Central␊ | 
| directory record field is set to 0xFFFF or 0xFFFFFFFF.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value                    Size       Description␊ | 
| -----                    ----       -----------␊ | 
| (ZIP64) 0x0001           2 bytes    Tag for this "extra" block type␊ | 
| Size                     2 bytes    Size of this "extra" block␊ | 
| Original Size            8 bytes    Original uncompressed file size␊ | 
| Compressed Size          8 bytes    Size of compressed data␊ | 
| Relative Header Offset   8 bytes    Offset of local header record␊ | 
| Disk Start Number        4 bytes    Number of the disk on which␊ | 
|                                     this file starts␊ | 
| ␊ | 
| This entry in the Local header must include BOTH original␊ | 
| and compressed file size fields. If encrypting the ␊ | 
| central directory and bit 13 of the general purpose bit␊ | 
| flag is set indicating masking, the value stored in the␊ | 
| Local Header for the original file size will be zero.␊ | 
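Because the Zip64 fields keep a fixed order but appear only when the matching header field holds its sentinel value, a reader has to consume them conditionally. The sketch below shows one way to do that; `parse_zip64_extra` and its parameter names are hypothetical, and `data` is assumed to be the block contents after the 4-byte tag and size.

```python
import struct

def parse_zip64_extra(data, uncompressed, compressed, offset=0, disk=0):
    """Resolve header values using the data part of a 0x0001 block.

    The arguments after `data` are the values read from the Local or
    Central directory record. Each one that holds its sentinel value is
    replaced from the block, in the fixed order of the layout table.
    """
    pos = 0

    def take(fmt, size):
        nonlocal pos
        if pos + size > len(data):
            raise ValueError("Zip64 block is too short")
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += size
        return value

    if uncompressed == 0xFFFFFFFF:
        uncompressed = take("<Q", 8)   # Original Size
    if compressed == 0xFFFFFFFF:
        compressed = take("<Q", 8)     # Compressed Size
    if offset == 0xFFFFFFFF:
        offset = take("<Q", 8)         # Relative Header Offset
    if disk == 0xFFFF:
        disk = take("<I", 4)           # Disk Start Number
    return uncompressed, compressed, offset, disk
```

For a Local header the caller would pass only the two size values, since that entry must carry both size fields and has no offset or disk field.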
| ␊ | 
| ␊ | 
| -OS/2 Extra Field (0x0009):␊ | 
| ␊ | 
| The following is the layout of the OS/2 attributes "extra" ␊ | 
| block.  (Last Revision  09/05/95)␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value       Size          Description␊ | 
| -----       ----          -----------␊ | 
| (OS/2)  0x0009      2 bytes       Tag for this "extra" block type␊ | 
| TSize       2 bytes       Size for the following data block␊ | 
| BSize       4 bytes       Uncompressed Block Size␊ | 
| CType       2 bytes       Compression type␊ | 
| EACRC       4 bytes       CRC value for uncompressed block␊ | 
| (var)       variable      Compressed block␊ | 
| ␊ | 
| The OS/2 extended attribute structure (FEA2LIST) is ␊ | 
| compressed and then stored in its entirety within this ␊ | 
| structure.  There will only ever be one "block" of data in ␊ | 
| VarFields[].␊ | 
| ␊ | 
| -NTFS Extra Field (0x000a):␊ | 
| ␊ | 
| The following is the layout of the NTFS attributes ␊ | 
| "extra" block. (Note: At this time the Mtime, Atime␊ | 
| and Ctime values may be used on any WIN32 system.)  ␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value      Size       Description␊ | 
| -----      ----       -----------␊ | 
| (NTFS)  0x000a     2 bytes    Tag for this "extra" block type␊ | 
| TSize      2 bytes    Size of the total "extra" block␊ | 
| Reserved   4 bytes    Reserved for future use␊ | 
| Tag1       2 bytes    NTFS attribute tag value #1␊ | 
| Size1      2 bytes    Size of attribute #1, in bytes␊ | 
| (var.)     Size1      Attribute #1 data␊ | 
| .␊ | 
| .␊ | 
| .␊ | 
| TagN       2 bytes    NTFS attribute tag value #N␊ | 
| SizeN      2 bytes    Size of attribute #N, in bytes␊ | 
| (var.)     SizeN      Attribute #N data␊ | 
| ␊ | 
| For NTFS, values for Tag1 through TagN are as follows:␊ | 
| (currently only one set of attributes is defined for NTFS)␊ | 
| ␊ | 
| Tag        Size       Description␊ | 
| -----      ----       -----------␊ | 
| 0x0001     2 bytes    Tag for attribute #1 ␊ | 
| Size1      2 bytes    Size of attribute #1, in bytes␊ | 
| Mtime      8 bytes    File last modification time␊ | 
| Atime      8 bytes    File last access time␊ | 
| Ctime      8 bytes    File creation time␊ | 
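A reader for the NTFS block might walk the Tag/Size attributes and then unpack attribute 0x0001. The sketch below is illustrative only: the function names are hypothetical, `data` is assumed to be the block contents after the 4-byte tag and TSize, and the three times are returned as raw 64-bit integers because this section does not define their units.

```python
import struct

def parse_ntfs_extra(data: bytes):
    """Return {tag: payload} for the data part of a 0x000a block."""
    attrs = {}
    pos = 4  # skip the 4-byte Reserved field
    while pos + 4 <= len(data):
        tag, size = struct.unpack_from("<HH", data, pos)
        pos += 4
        attrs[tag] = data[pos:pos + size]
        pos += size
    return attrs

def ntfs_times(attrs):
    """Decode attribute 0x0001 into (mtime, atime, ctime), or None."""
    payload = attrs.get(0x0001)
    if payload is None or len(payload) < 24:
        return None
    # Three consecutive 8-byte little-endian values: Mtime, Atime, Ctime.
    return struct.unpack_from("<QQQ", payload)
```

Unknown tags are kept in the returned dictionary rather than rejected, so attributes defined later can still be passed through.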
| ␊ | 
| -OpenVMS Extra Field (0x000c):␊ | 
| ␊ | 
| The following is the layout of the OpenVMS attributes ␊ | 
| "extra" block.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value      Size       Description␊ | 
| -----      ----       -----------␊ | 
| (VMS)   0x000c     2 bytes    Tag for this "extra" block type␊ | 
| TSize      2 bytes    Size of the total "extra" block␊ | 
| CRC        4 bytes    32-bit CRC for remainder of the block␊ | 
| Tag1       2 bytes    OpenVMS attribute tag value #1␊ | 
| Size1      2 bytes    Size of attribute #1, in bytes␊ | 
| (var.)     Size1      Attribute #1 data␊ | 
| .␊ | 
| .␊ | 
| .␊ | 
| TagN       2 bytes    OpenVMS attribute tag value #N␊ | 
| SizeN      2 bytes    Size of attribute #N, in bytes␊ | 
| (var.)     SizeN      Attribute #N data␊ | 
| ␊ | 
| Rules:␊ | 
| ␊ | 
| 1. There will be one or more attributes present, which ␊ | 
| will each be preceded by the above TagX & SizeX values.  ␊ | 
| These values are identical to the ATR$C_XXXX and ␊ | 
| ATR$S_XXXX constants which are defined in ATR.H under ␊ | 
| OpenVMS C.  Neither of these values will ever be zero.␊ | 
| ␊ | 
| 2. No word alignment or padding is performed.␊ | 
| ␊ | 
| 3. A well-behaved PKZIP/OpenVMS program should never produce␊ | 
| more than one sub-block with the same TagX value.  Also,␊ | 
| there will never be more than one "extra" block of type␊ | 
| 0x000c in a particular directory record.␊ | 
| ␊ | 
| -UNIX Extra Field (0x000d):␊ | 
| ␊ | 
| The following is the layout of the UNIX "extra" block.␊ | 
| Note: all fields are stored in Intel low-byte/high-byte ␊ | 
| order.␊ | 
| ␊ | 
| Value       Size          Description␊ | 
| -----       ----          -----------␊ | 
| (UNIX)  0x000d      2 bytes       Tag for this "extra" block type␊ | 
| TSize       2 bytes       Size for the following data block␊ | 
| Atime       4 bytes       File last access time␊ | 
| Mtime       4 bytes       File last modification time␊ | 
| Uid         2 bytes       File user ID␊ | 
| Gid         2 bytes       File group ID␊ | 
| (var)       variable      Variable length data field␊ | 
| ␊ | 
| The variable length data field will contain file type ␊ | 
| specific data.  Currently the only values allowed are␊ | 
| the original "linked to" file names for hard or symbolic ␊ | 
| links, and the major and minor device node numbers for␊ | 
| character and block device nodes.  Since device nodes␊ | 
| cannot be either symbolic or hard links, only one set of␊ | 
| variable length data is stored.  Link files will have the␊ | 
| name of the original file stored.  This name is NOT NULL␊ | 
| terminated.  Its size can be determined by checking TSize -␊ | 
| 12.  Device entries will have eight bytes stored as two 4␊ | 
| byte entries (in little endian format).  The first entry␊ | 
| will be the major device number, and the second the minor␊ | 
| device number.␊ | 
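The fixed 12 bytes and the variable part of the UNIX block could be separated as sketched below. The function names are hypothetical, and the caller is assumed to already know from other metadata whether the entry is a link or a device node.

```python
import struct

def parse_unix_extra(data: bytes):
    """Split the data part of a 0x000d block into fixed and variable parts."""
    if len(data) < 12:
        raise ValueError("UNIX extra block needs at least 12 bytes")
    atime, mtime, uid, gid = struct.unpack_from("<IIHH", data)
    # Whatever follows the 12 fixed bytes is the variable data (TSize - 12).
    return {"atime": atime, "mtime": mtime, "uid": uid, "gid": gid,
            "var": data[12:]}

def device_numbers(var: bytes):
    """Interpret the variable data of a device node as (major, minor)."""
    if len(var) != 8:
        raise ValueError("device entries store exactly eight bytes")
    return struct.unpack("<II", var)
```

For a link entry, `var` is the linked-to name as raw bytes with no NULL terminator; decoding it to text is left to the caller.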
| ␊ | 
| -PATCH Descriptor Extra Field (0x000f):␊ | 
| ␊ | 
| The following is the layout of the Patch Descriptor "extra"␊ | 
| block.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (Patch) 0x000f    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of the total "extra" block␊ | 
| Version   2 bytes  Version of the descriptor␊ | 
| Flags     4 bytes  Actions and reactions (see below) ␊ | 
| OldSize   4 bytes  Size of the file about to be patched ␊ | 
| OldCRC    4 bytes  32-bit CRC of the file to be patched ␊ | 
| NewSize   4 bytes  Size of the resulting file ␊ | 
| NewCRC    4 bytes  32-bit CRC of the resulting file ␊ | 
| ␊ | 
| Actions and reactions␊ | 
| ␊ | 
| Bits          Description␊ | 
| ----          ----------------␊ | 
| 0             Use for auto detection␊ | 
| 1             Treat as a self-patch␊ | 
| 2-3           RESERVED␊ | 
| 4-5           Action (see below)␊ | 
| 6-7           RESERVED␊ | 
| 8-9           Reaction (see below) to absent file ␊ | 
| 10-11         Reaction (see below) to newer file␊ | 
| 12-13         Reaction (see below) to unknown file␊ | 
| 14-15         RESERVED␊ | 
| 16-31         RESERVED␊ | 
| ␊ | 
| Actions␊ | 
| ␊ | 
| Action       Value␊ | 
| ------       ----- ␊ | 
| none         0␊ | 
| add          1␊ | 
| delete       2␊ | 
| patch        3␊ | 
| ␊ | 
| Reactions␊ | 
| ␊ | 
| Reaction     Value␊ | 
| --------     -----␊ | 
| ask          0␊ | 
| skip         1␊ | 
| ignore       2␊ | 
| fail         3␊ | 
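The Flags bit fields can be unpacked with shifts and masks, as in the sketch below. It assumes bit 0 is the least significant bit of the 4-byte value; `decode_patch_flags` and the dictionary keys are hypothetical names.

```python
ACTIONS = ("none", "add", "delete", "patch")
REACTIONS = ("ask", "skip", "ignore", "fail")

def decode_patch_flags(flags: int):
    """Decode the 4-byte Flags value of a Patch Descriptor block."""
    return {
        "auto_detect": bool(flags & 0x1),            # bit 0
        "self_patch": bool(flags & 0x2),             # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],       # bits 4-5
        "absent": REACTIONS[(flags >> 8) & 0x3],     # bits 8-9
        "newer": REACTIONS[(flags >> 10) & 0x3],     # bits 10-11
        "unknown": REACTIONS[(flags >> 12) & 0x3],   # bits 12-13
    }
```

Reserved bits are ignored here; a stricter reader could reject values that set them.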
| ␊ | 
| Patch support is provided by PKPatchMaker(tm) technology and is ␊ | 
| covered under U.S. Patents and Patents Pending. The use or ␊ | 
| implementation in a product of certain technological aspects set␊ | 
| forth in the current APPNOTE, including those with regard to ␊ | 
| strong encryption, patching, or extended tape operations requires␊ | 
| a license from PKWARE.  Please contact PKWARE with regard to ␊ | 
| acquiring a license. ␊ | 
| ␊ | 
| -PKCS#7 Store for X.509 Certificates (0x0014):␊ | 
| ␊ | 
| This field contains information about each of the certificates ␊ | 
| that files may be signed with. When the Central Directory Encryption ␊ | 
| feature is enabled for a ZIP file, this record will appear in ␊ | 
| the Archive Extra Data Record, otherwise it will appear in the ␊ | 
| first central directory record and will be ignored in any ␊ | 
| other record.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (Store) 0x0014    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of the store data␊ | 
| TData     TSize    Data about the store␊ | 
| ␊ | 
| ␊ | 
| -X.509 Certificate ID and Signature for individual file (0x0015):␊ | 
| ␊ | 
| This field contains the information about which certificate in ␊ | 
| the PKCS#7 store was used to sign a particular file. It also ␊ | 
| contains the signature data. This field can appear multiple ␊ | 
| times, but can only appear once per certificate.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (CID)   0x0015    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of data that follows␊ | 
| TData     TSize    Signature Data␊ | 
| ␊ | 
| -X.509 Certificate ID and Signature for central directory (0x0016):␊ | 
| ␊ | 
| This field contains the information about which certificate in ␊ | 
| the PKCS#7 store was used to sign the central directory structure.␊ | 
| When the Central Directory Encryption feature is enabled for a ␊ | 
| ZIP file, this record will appear in the Archive Extra Data Record, ␊ | 
| otherwise it will appear in the first central directory record.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (CDID)  0x0016    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of data that follows␊ | 
| TData     TSize    Data␊ | 
| ␊ | 
| -Strong Encryption Header (0x0017):␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| 0x0017    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of data that follows␊ | 
| Format    2 bytes  Format definition for this record␊ | 
| AlgID     2 bytes  Encryption algorithm identifier␊ | 
| Bitlen    2 bytes  Bit length of encryption key␊ | 
| Flags     2 bytes  Processing flags␊ | 
| CertData  TSize-8  Certificate decryption extra field data␊ | 
|                    (refer to the explanation for CertData␊ | 
|                    in the section describing the␊ | 
|                    Certificate Processing Method under␊ | 
|                    the Strong Encryption Specification)␊ | 
| ␊ | 
| ␊ | 
| -Record Management Controls (0x0018):␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (Rec-CTL) 0x0018    2 bytes  Tag for this "extra" block type␊ | 
| CSize     2 bytes  Size of total extra block data␊ | 
| Tag1      2 bytes  Record control attribute 1␊ | 
| Size1     2 bytes  Size of attribute 1, in bytes␊ | 
| Data1     Size1    Attribute 1 data␊ | 
| .␊ | 
| .␊ | 
| .␊ | 
| TagN      2 bytes  Record control attribute N␊ | 
| SizeN     2 bytes  Size of attribute N, in bytes␊ | 
| DataN     SizeN    Attribute N data␊ | 
| ␊ | 
| ␊ | 
| -PKCS#7 Encryption Recipient Certificate List (0x0019): ␊ | 
| ␊ | 
| This field contains information about each of the certificates␊ | 
| used in encryption processing and it can be used to identify who is␊ | 
| allowed to decrypt encrypted files.  This field should only appear ␊ | 
| in the archive extra data record. This field is not required and ␊ | 
| serves only to aid archive modifications by preserving public ␊ | 
| encryption key data. Individual security requirements may dictate ␊ | 
| that this data be omitted to deter information exposure.␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| (CStore) 0x0019    2 bytes  Tag for this "extra" block type␊ | 
| TSize     2 bytes  Size of the store data␊ | 
| TData     TSize    Data about the store␊ | 
| ␊ | 
| TData:␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| Version   2 bytes  Format version number - must be 0x0001 at this time␊ | 
| CStore    (var)    PKCS#7 data blob␊ | 
| ␊ | 
| ␊ | 
| -MVS Extra Field (0x0065):␊ | 
| ␊ | 
| The following is the layout of the MVS "extra" block.␊ | 
| Note: Some fields are stored in Big Endian format.␊ | 
| All text is in EBCDIC format unless otherwise specified.␊ | 
| ␊ | 
| Value       Size          Description␊ | 
| -----       ----          -----------␊ | 
| (MVS)   0x0065      2 bytes       Tag for this "extra" block type␊ | 
| TSize       2 bytes       Size for the following data block␊ | 
| ID          4 bytes       EBCDIC "Z390" 0xE9F3F9F0 or␊ | 
|                           "T4MV" for TargetFour␊ | 
| (var)       TSize-4       Attribute data (see APPENDIX B)␊ | 
| ␊ | 
| ␊ | 
| -OS/400 Extra Field (0x0065):␊ | 
| ␊ | 
| The following is the layout of the OS/400 "extra" block.␊ | 
| Note: Some fields are stored in Big Endian format.␊ | 
| All text is in EBCDIC format unless otherwise specified.␊ | 
| ␊ | 
| Value       Size          Description␊ | 
| -----       ----          -----------␊ | 
| (OS400) 0x0065      2 bytes       Tag for this "extra" block type␊ | 
| TSize       2 bytes       Size for the following data block␊ | 
| ID          4 bytes       EBCDIC "I400" 0xC9F4F0F0 or␊ | 
|                           "T4MV" for TargetFour␊ | 
| (var)       TSize-4       Attribute data (see APPENDIX A)␊ | 
| ␊ | 
| ␊ | 
| Third-party Mappings:␊ | 
| ␊ | 
| -ZipIt Macintosh Extra Field (long) (0x2605):␊ | 
| ␊ | 
| The following is the layout of the ZipIt extra block ␊ | 
| for Macintosh. The local-header and central-header versions ␊ | 
| are identical. This block must be present if the file is ␊ | 
| stored MacBinary-encoded and it should not be used if the file ␊ | 
| is not stored MacBinary-encoded.␊ | 
| ␊ | 
| Value         Size        Description␊ | 
| -----         ----        -----------␊ | 
| (Mac2)  0x2605        Short       tag for this extra block type␊ | 
| TSize         Short       total data size for this block␊ | 
| "ZPIT"        beLong      extra-field signature␊ | 
| FnLen         Byte        length of FileName␊ | 
| FileName      variable    full Macintosh filename␊ | 
| FileType      Byte[4]     four-byte Mac file type string␊ | 
| Creator       Byte[4]     four-byte Mac creator string␊ | 
| ␊ | 
| ␊ | 
| -ZipIt Macintosh Extra Field (short, for files) (0x2705):␊ | 
| ␊ | 
| The following is the layout of a shortened variant of the␊ | 
| ZipIt extra block for Macintosh (without "full name" entry).␊ | 
| This variant is used by ZipIt 1.3.5 and newer for entries of␊ | 
| files (not directories) that do not have a MacBinary encoded␊ | 
| file. The local-header and central-header versions are identical.␊ | 
| ␊ | 
| Value         Size        Description␊ | 
| -----         ----        -----------␊ | 
| (Mac2b) 0x2705        Short       tag for this extra block type␊ | 
| TSize         Short       total data size for this block (12)␊ | 
| "ZPIT"        beLong      extra-field signature␊ | 
| FileType      Byte[4]     four-byte Mac file type string␊ | 
| Creator       Byte[4]     four-byte Mac creator string␊ | 
| fdFlags       beShort     attributes from FInfo.fdFlags,␊ | 
|                           may be omitted␊ | 
| 0x0000        beShort     reserved, may be omitted␊ | 
| ␊ | 
| ␊ | 
| -ZipIt Macintosh Extra Field (short, for directories) (0x2805):␊ | 
| ␊ | 
| The following is the layout of a shortened variant of the␊ | 
| ZipIt extra block for Macintosh used only for directory␊ | 
| entries. This variant is used by ZipIt 1.3.5 and newer to ␊ | 
| save some optional Mac-specific information about directories.␊ | 
| The local-header and central-header versions are identical.␊ | 
| ␊ | 
| Value         Size        Description␊ | 
| -----         ----        -----------␊ | 
| (Mac2c) 0x2805        Short       tag for this extra block type␊ | 
| TSize         Short       total data size for this block (12)␊ | 
| "ZPIT"        beLong      extra-field signature␊ | 
| frFlags       beShort     attributes from DInfo.frFlags, may␊ | 
|                           be omitted␊ | 
| View          beShort     ZipIt view flag, may be omitted␊ | 
| ␊ | 
| ␊ | 
| The View field specifies ZipIt-internal settings as follows:␊ | 
| ␊ | 
| Bits of the View field:␊ | 
| bit 0           if set, the folder is shown expanded (open)␊ | 
|                 when the archive contents are viewed in ZipIt.␊ | 
| bits 1-15       reserved, zero␊ | 
| ␊ | 
| ␊ | 
| -FWKCS MD5 Extra Field (0x4b46):␊ | 
| ␊ | 
| The FWKCS Contents_Signature System, used in␊ | 
| automatically identifying files independent of file name,␊ | 
| optionally adds and uses an extra field to support the␊ | 
| rapid creation of an enhanced contents_signature:␊ | 
| ␊ | 
| Header ID = 0x4b46␊ | 
| Data Size = 0x0013␊ | 
| Preface   = 'M','D','5'␊ | 
| followed by 16 bytes containing the uncompressed file's␊ | 
| 128_bit MD5 hash(1), low byte first.␊ | 
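Building this block might look like the sketch below. The function name `build_fwkcs_md5_extra` is hypothetical, and the code assumes that the byte order returned by `hashlib`'s `digest()` is the "low byte first" order described above.

```python
import hashlib
import struct

def build_fwkcs_md5_extra(content: bytes) -> bytes:
    """Build a 0x4b46 extra block for the given uncompressed content."""
    # Assumption: digest() already yields the hash low byte first.
    body = b"MD5" + hashlib.md5(content).digest()
    # Header ID and Data Size (0x0013 = 3 preface bytes + 16 hash bytes).
    return struct.pack("<HH", 0x4B46, len(body)) + body
```

The resulting block is 23 bytes long: 4 header bytes followed by the 19 data bytes.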
| ␊ | 
| When FWKCS revises a .ZIP file central directory to add␊ | 
| this extra field for a file, it also replaces the␊ | 
| central directory entry for that file's uncompressed␊ | 
| file length with a measured value.␊ | 
| ␊ | 
| FWKCS provides an option to strip this extra field, if␊ | 
| present, from a .ZIP file central directory. In adding␊ | 
| this extra field, FWKCS preserves .ZIP file Authenticity␊ | 
| Verification; if stripping this extra field, FWKCS␊ | 
| preserves all versions of AV through PKZIP version 2.04g.␊ | 
| ␊ | 
| FWKCS, and FWKCS Contents_Signature System, are␊ | 
| trademarks of Frederick W. Kantor.␊ | 
| ␊ | 
| (1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer␊ | 
| Science and RSA Data Security, Inc., April 1992.␊ | 
| ll.76-77: "The MD5 algorithm is being placed in the␊ | 
| public domain for review and possible adoption as a␊ | 
| standard."␊ | 
| ␊ | 
| -Microsoft Open Packaging Growth Hint (0xa220):␊ | 
| ␊ | 
| Value         Size        Description␊ | 
| -----         ----        -----------␊ | 
| 0xa220        Short       tag for this extra block type␊ | 
| TSize         Short       size of Sig + PadVal + Padding␊ | 
| Sig           Short       verification signature (A028)␊ | 
| PadVal        Short       Initial padding value␊ | 
| Padding       variable    filled with NULL characters␊ | 
| ␊ | 
| ␊ | 
| file comment: (Variable)␊ | 
| ␊ | 
| The comment for this file.␊ | 
| ␊ | 
| number of this disk: (2 bytes)␊ | 
| ␊ | 
| The number of this disk, which contains the end of central␊ | 
| directory record. If an archive is in ZIP64 format␊ | 
| and the value in this field is 0xFFFF, the disk number will ␊ | 
| be in the corresponding 4 byte zip64 end of central ␊ | 
| directory field.␊ | 
| ␊ | 
| ␊ | 
| number of the disk with the start of the central␊ | 
| directory: (2 bytes)␊ | 
| ␊ | 
| The number of the disk on which the central␊ | 
| directory starts. If an archive is in ZIP64 format␊ | 
| and the value in this field is 0xFFFF, the disk number will ␊ | 
| be in the corresponding 4 byte zip64 end of central ␊ | 
| directory field.␊ | 
| ␊ | 
| total number of entries in the central dir on ␊ | 
| this disk: (2 bytes)␊ | 
| ␊ | 
| The number of central directory entries on this disk.␊ | 
| If an archive is in ZIP64 format and the value in ␊ | 
| this field is 0xFFFF, the count will be in the ␊ | 
| corresponding 8 byte zip64 end of central ␊ | 
| directory field.␊ | 
| ␊ | 
| total number of entries in the central dir: (2 bytes)␊ | 
| ␊ | 
| The total number of files in the .ZIP file. If an ␊ | 
| archive is in ZIP64 format and the value in this field␊ | 
| is 0xFFFF, the count will be in the corresponding 8 byte ␊ | 
| zip64 end of central directory field.␊ | 
| ␊ | 
| size of the central directory: (4 bytes)␊ | 
| ␊ | 
| The size (in bytes) of the entire central directory.␊ | 
| If an archive is in ZIP64 format and the value in ␊ | 
| this field is 0xFFFFFFFF, the size will be in the ␊ | 
| corresponding 8 byte zip64 end of central ␊ | 
| directory field.␊ | 
| ␊ | 
| offset of start of central directory with respect to␊ | 
| the starting disk number:  (4 bytes)␊ | 
| ␊ | 
| Offset of the start of the central directory on the␊ | 
| disk on which the central directory starts. If an ␊ | 
| archive is in ZIP64 format and the value in this ␊ | 
| field is 0xFFFFFFFF, the offset will be in the ␊ | 
| corresponding 8 byte zip64 end of central ␊ | 
| directory field.␊ | 
| ␊ | 
| .ZIP file comment length: (2 bytes)␊ | 
| ␊ | 
| The length of the comment for this .ZIP file.␊ | 
| ␊ | 
| .ZIP file comment: (Variable)␊ | 
| ␊ | 
| The comment for this .ZIP file.  ZIP file comment data␊ | 
| is stored unsecured.  No encryption or data authentication␊ | 
| is applied to this area at this time.  Confidential information␊ | 
| should not be stored in this section.␊ | 
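The end of central directory fields described above can be read in one pass, noting which of them hold a sentinel value that defers to the zip64 record. This is a sketch under stated assumptions: the signature 0x06054b50 comes from the record definition elsewhere in this specification, and `parse_eocd` and the dictionary keys are hypothetical names.

```python
import struct

EOCD_FORMAT = "<IHHHHIIH"    # signature followed by the seven fixed fields
EOCD_SIGNATURE = 0x06054B50  # from the record definition in this specification

def parse_eocd(record: bytes):
    """Return (fields, overflowed) for an end of central directory record."""
    (sig, disk, cd_disk, entries_on_disk, entries_total,
     cd_size, cd_offset, comment_len) = struct.unpack_from(EOCD_FORMAT, record)
    if sig != EOCD_SIGNATURE:
        raise ValueError("not an end of central directory record")
    start = struct.calcsize(EOCD_FORMAT)
    fields = {
        "disk": disk, "cd_disk": cd_disk,
        "entries_on_disk": entries_on_disk, "entries_total": entries_total,
        "cd_size": cd_size, "cd_offset": cd_offset,
        "comment": record[start:start + comment_len],
    }
    # 2-byte fields overflow at 0xFFFF, 4-byte fields at 0xFFFFFFFF.
    limits = {"disk": 0xFFFF, "cd_disk": 0xFFFF, "entries_on_disk": 0xFFFF,
              "entries_total": 0xFFFF, "cd_size": 0xFFFFFFFF,
              "cd_offset": 0xFFFFFFFF}
    overflowed = [name for name, limit in limits.items()
                  if fields[name] == limit]
    return fields, overflowed
```

A non-empty `overflowed` list tells the caller which values must be taken from the zip64 end of central directory record instead.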
| ␊ | 
| zip64 extensible data sector    (variable size)␊ | 
| ␊ | 
| (currently reserved for use by PKWARE)␊ | 
| ␊ | 
| ␊ | 
| K.  Splitting and Spanning ZIP files␊ | 
| ␊ | 
| Spanning is the process of segmenting a ZIP file across ␊ | 
| multiple removable media. This support has typically only ␊ | 
| been provided for DOS formatted floppy diskettes. ␊ | 
| ␊ | 
| File splitting is a newer derivative of spanning.  ␊ | 
| Splitting follows the same segmentation process as␊ | 
| spanning; however, it does not require writing each␊ | 
| segment to a unique removable medium and instead supports␊ | 
| placing all pieces onto local or non-removable locations␊ | 
| such as file systems, local drives, folders, etc.␊ | 
| ␊ | 
| A key difference between spanned and split ZIP files is␊ | 
| that all pieces of a spanned ZIP file have the same name.  ␊ | 
| Since each piece is written to a separate volume, no name ␊ | 
| collisions occur and each segment can reuse the original ␊ | 
| .ZIP file name given to the archive.␊ | 
| ␊ | 
| Sequence ordering for DOS spanned archives uses the DOS ␊ | 
| volume label to determine segment numbers.  Volume labels␊ | 
| for each segment are written using the form PKBACK#xxx, ␊ | 
| where xxx is the segment number written as a decimal ␊ | 
| value from 001 - nnn.␊ | 
| ␊ | 
| Split ZIP files are typically written to the same location␊ | 
| and are subject to name collisions if the spanned name␊ | 
| format is used since each segment will reside on the same ␊ | 
| drive. To avoid name collisions, split archives are named ␊ | 
| as follows.␊ | 
| ␊ | 
| Segment 1   = filename.z01␊ | 
| Segment n-1 = filename.z(n-1)␊ | 
| Segment n   = filename.zip␊ | 
| ␊ | 
| The .ZIP extension is used on the last segment to support␊ | 
| quickly reading the central directory.  The segment number␊ | 
| n should be a decimal value.␊ | 
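The naming scheme for split archives can be generated as sketched below. The function name `split_segment_names` is hypothetical, and two-digit zero padding follows the "z01" example; how segment numbers above 99 are written is an assumption here.

```python
def split_segment_names(base: str, count: int):
    """Return the file names for a split archive with `count` segments."""
    if count < 1:
        raise ValueError("a split archive has at least one segment")
    # Segments 1..n-1 use .zNN with a decimal segment number.
    names = [f"{base}.z{i:02d}" for i in range(1, count)]
    # The last segment keeps the .zip extension so the central directory
    # can be located quickly.
    names.append(f"{base}.zip")
    return names
```

For example, `split_segment_names("backup", 3)` yields `backup.z01`, `backup.z02` and `backup.zip`.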
| ␊ | 
| Spanned ZIP files may be PKSFX Self-extracting ZIP files.␊ | 
| PKSFX files may also be split; however, in this case␊ | 
| the first segment must be named filename.exe.  The first␊ | 
| segment of a split PKSFX archive must be large enough to␊ | 
| include the entire executable program.␊ | 
| ␊ | 
| Capacities for split archives are as follows.␊ | 
| ␊ | 
| Maximum number of segments = 4,294,967,295 - 1␊ | 
| Maximum .ZIP segment size = 4,294,967,295 bytes␊ | 
| Minimum segment size = 64K␊ | 
| Maximum PKSFX segment size = 2,147,483,647 bytes␊ | 
| ␊ | 
| Segment sizes may differ; however, by convention, all ␊ | 
| segment sizes should be the same with the exception of the ␊ | 
| last, which may be smaller.  Local and central directory ␊ | 
| header records must never be split across a segment boundary. ␊ | 
| When writing a header record, if the number of bytes remaining ␊ | 
| within a segment is less than the size of the header record,␊ | 
| end the current segment and write the header at the start␊ | 
| of the next segment.  The central directory may span segment␊ | 
| boundaries, but no single record in the central directory␊ | 
| should be split across segments.␊ | 
| ␊ | 
| Spanned/Split archives created using PKZIP for Windows␊ | 
| (V2.50 or greater), PKZIP Command Line (V2.50 or greater),␊ | 
| or PKZIP Explorer will include a special spanning ␊ | 
| signature as the first 4 bytes of the first segment of␊ | 
| the archive.  This signature (0x08074b50) will be ␊ | 
| followed immediately by the local header signature for␊ | 
| the first file in the archive.  ␊ | 
| ␊ | 
| A special spanning marker may also appear in spanned/split ␊ | 
| archives if the spanning or splitting process starts but ␊ | 
| only requires one segment.  In this case the 0x08074b50 ␊ | 
| signature will be replaced with the temporary spanning ␊ | 
| marker signature of 0x30304b50.  Split archives can␊ | 
| only be uncompressed by other versions of PKZIP that␊ | 
| know how to create a split archive.␊ | 
| ␊ | 
| The signature value 0x08074b50 is also used by some␊ | 
| ZIP implementations as a marker for the Data Descriptor ␊ | 
| record.  Conflict in this alternate assignment can be␊ | 
| avoided by checking the position of the signature␊ | 
| within the ZIP file to determine the use for which it␊ | 
| is intended.  ␊ | 
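Telling these signatures apart by position can be as simple as inspecting the first 4 bytes of the first segment. In the sketch below, `classify_first_bytes` is a hypothetical helper, and the local header signature 0x04034b50 is taken from the record definitions elsewhere in this specification.

```python
import struct

SPANNING_SIGNATURE = 0x08074B50
TEMPORARY_MARKER = 0x30304B50
LOCAL_HEADER_SIGNATURE = 0x04034B50  # defined elsewhere in this specification

def classify_first_bytes(segment: bytes) -> str:
    """Classify the first 4 bytes of the first segment of an archive."""
    if len(segment) < 4:
        return "unknown"
    (signature,) = struct.unpack_from("<I", segment, 0)
    if signature == SPANNING_SIGNATURE:
        # At offset 0 this value is the spanning signature; elsewhere some
        # implementations use the same value to mark a Data Descriptor.
        return "spanning signature"
    if signature == TEMPORARY_MARKER:
        return "temporary spanning marker"
    if signature == LOCAL_HEADER_SIGNATURE:
        return "local header"
    return "unknown"
```

Only offset 0 is examined, which is what distinguishes the spanning signature from a Data Descriptor marker appearing later in the file.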
| ␊ | 
| L.  General notes:␊ | 
| ␊ | 
| 1)  All fields unless otherwise noted are unsigned and stored␊ | 
| in Intel low-byte:high-byte, low-word:high-word order.␊ | 
| ␊ | 
| 2)  String fields are not null terminated, since the␊ | 
| length is given explicitly.␊ | 
| ␊ | 
| 3)  The entries in the central directory may not necessarily␊ | 
| be in the same order that files appear in the .ZIP file.␊ | 
| ␊ | 
| 4)  If one of the fields in the end of central directory␊ | 
| record is too small to hold required data, the field␊ | 
| should be set to -1 (0xFFFF or 0xFFFFFFFF) and the␊ | 
| ZIP64 format record should be created.␊ | 
| ␊ | 
| 5)  The end of central directory record and the␊ | 
| Zip64 end of central directory locator record must␊ | 
| reside on the same disk when splitting or spanning␊ | 
| an archive.␊ | 
| ␊ | 
| VI. UnShrinking - Method 1␊ | 
| --------------------------␊ | 
| ␊ | 
| Shrinking is a Dynamic Ziv-Lempel-Welch compression algorithm␊ | 
| with partial clearing.  The initial code size is 9 bits, and␊ | 
| the maximum code size is 13 bits.  Shrinking differs from␊ | 
| conventional Dynamic Ziv-Lempel-Welch implementations in several␊ | 
| respects:␊ | 
| ␊ | 
| 1)  The code size is controlled by the compressor, and is not␊ | 
| automatically increased when codes larger than the current␊ | 
| code size are created (but not necessarily used).  When␊ | 
| the decompressor encounters the code sequence 256␊ | 
| (decimal) followed by 1, it should increase the code size␊ | 
| read from the input stream to the next bit size.  No␊ | 
| blocking of the codes is performed, so the next code at␊ | 
| the increased size should be read from the input stream␊ | 
| immediately after where the previous code at the smaller␊ | 
| bit size was read.  Again, the decompressor should not␊ | 
| increase the code size used until the sequence 256,1 is␊ | 
| encountered.␊ | 
| ␊ | 
| 2)  When the table becomes full, total clearing is not␊ | 
| performed.  Rather, when the compressor emits the code␊ | 
| sequence 256,2 (decimal), the decompressor should clear␊ | 
| all leaf nodes from the Ziv-Lempel tree, and continue to␊ | 
| use the current code size.  The nodes that are cleared␊ | 
| from the Ziv-Lempel tree are then re-used, with the lowest␊ | 
| code value re-used first, and the highest code value␊ | 
| re-used last.  The compressor can emit the sequence 256,2␊ | 
| at any time.␊ | 
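Partial clearing can be pictured on a table that records each code's prefix code. The sketch below is one possible representation, not the layout used by any particular implementation: `parent`, `FREE`, and both function names are hypothetical, and it assumes codes 0-255 are literals, 256 is the control code, and 257 is the first assignable code.

```python
FREE = -1
FIRST_CODE = 257  # assumption: 0-255 are literals, 256 is the control code

def partial_clear(parent):
    """Free every leaf node; parent[c] is the prefix code of code c."""
    has_child = [False] * len(parent)
    for code in range(FIRST_CODE, len(parent)):
        if parent[code] != FREE:
            has_child[parent[code]] = True
    for code in range(FIRST_CODE, len(parent)):
        # A leaf is an in-use code that no other code uses as its prefix.
        if parent[code] != FREE and not has_child[code]:
            parent[code] = FREE

def next_free_code(parent):
    """Return the lowest free code, so cleared codes are re-used in order."""
    for code in range(FIRST_CODE, len(parent)):
        if parent[code] == FREE:
            return code
    return None
```

With a 13-bit maximum code size, the table would have 8192 slots; the code size itself is left unchanged by the clear.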
| ␊ | 
| VII. Expanding - Methods 2-5␊ | 
| ----------------------------␊ | 
| ␊ | 
| The Reducing algorithm is actually a combination of two␊ | 
| distinct algorithms.  The first algorithm compresses repeated␊ | 
| byte sequences, and the second algorithm takes the compressed␊ | 
| stream from the first algorithm and applies a probabilistic␊ | 
| compression method.␊ | 
| ␊ | 
| The probabilistic compression stores an array of 'follower␊ | 
| sets' S(j), for j=0 to 255, corresponding to each possible␊ | 
| ASCII character.  Each set contains between 0 and 32␊ | 
| characters, to be denoted as S(j)[0],...,S(j)[m], where m<32.␊ | 
| The sets are stored at the beginning of the data area for a␊ | 
| Reduced file, in reverse order, with S(255) first, and S(0)␊ | 
| last.␊ | 
| ␊ | 
| The sets are encoded as { N(j), S(j)[0],...,S(j)[N(j)-1] },␊ | 
| where N(j) is the size of set S(j).  N(j) can be 0, in which␊ | 
| case the follower set for S(j) is empty.  Each N(j) value is␊ | 
| encoded in 6 bits, followed by N(j) eight bit character values␊ | 
| corresponding to S(j)[0] to S(j)[N(j)-1] respectively.  If␊ | 
| N(j) is 0, then no values for S(j) are stored, and the value␊ | 
| for N(j-1) immediately follows.␊ | 
| ␊ | 
| Immediately after the follower sets is the compressed data␊ | 
| stream.  The compressed data stream can be interpreted for the␊ | 
| probabilistic decompression as follows:␊ | 
| ␊ | 
| let Last-Character <- 0.␊ | 
| loop until done␊ | 
|     if the follower set S(Last-Character) is empty then␊ | 
|         read 8 bits from the input stream, and copy this␊ | 
|         value to the output stream.␊ | 
|     otherwise if the follower set S(Last-Character) is non-empty then␊ | 
|         read 1 bit from the input stream.␊ | 
|         if this bit is not zero then␊ | 
|             read 8 bits from the input stream, and copy this␊ | 
|             value to the output stream.␊ | 
|         otherwise if this bit is zero then␊ | 
|             read B(N(Last-Character)) bits from the input␊ | 
|             stream, and assign this value to I.␊ | 
|             Copy the value of S(Last-Character)[I] to the␊ | 
|             output stream.␊ | 
| ␊ | 
|     assign the last value placed on the output stream to␊ | 
|     Last-Character.␊ | 
| end loop␊ | 
| ␊ | 
| B(N(j)) is defined as the minimal number of bits required to␊ | 
| encode the value N(j)-1.␊ | 
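
The probabilistic stage described above can be sketched in Python as
follows.  This is only an illustration under stated assumptions: the
names BitReader, read_follower_sets, index_bits and unreduce_stage1 are
hypothetical, bits are assumed to be consumed least-significant-bit
first within each byte, a follower set with a single entry is assumed
to use a 1-bit index, and the hypothetical out_len parameter stands in
for the "until done" condition.

```python
class BitReader:
    """Reads bits from a byte string, least significant bit first (assumed order)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position

    def read(self, count):
        value = 0
        for i in range(count):
            byte = self.data[self.pos >> 3]
            value |= ((byte >> (self.pos & 7)) & 1) << i
            self.pos += 1
        return value


def read_follower_sets(bits):
    """Parse { N(j), S(j)[0..N(j)-1] } for j = 255 down to 0."""
    sets = [None] * 256
    for j in range(255, -1, -1):
        n = bits.read(6)
        sets[j] = [bits.read(8) for _ in range(n)]
    return sets


def index_bits(n):
    """B(N): bits needed to encode N-1 (taken as at least 1, an assumption)."""
    return max(1, (n - 1).bit_length())


def unreduce_stage1(data, out_len):
    """Undo the probabilistic stage, producing out_len intermediate bytes."""
    bits = BitReader(data)
    sets = read_follower_sets(bits)
    out = bytearray()
    last = 0
    while len(out) < out_len:
        followers = sets[last]
        # An empty set means a raw byte; otherwise a 1 bit also means a raw byte.
        if not followers or bits.read(1):
            ch = bits.read(8)
        else:
            ch = followers[bits.read(index_bits(len(followers)))]
        out.append(ch)
        last = ch
    return bytes(out)
```

The output of this stage still contains DLE sequences and must be
passed through the expansion step described next.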
| ␊ | 
| The decompressed stream from above can then be expanded to␊ | 
| re-create the original file as follows:␊ | 
| ␊ | 
| let State <- 0.␊ | 
| ␊ | 
| loop until done␊ | 
|     read 8 bits from the input stream into C.␊ | 
|     case State of␊ | 
|         0:  if C is not equal to DLE (144 decimal) then␊ | 
|                 copy C to the output stream.␊ | 
|             otherwise if C is equal to DLE then␊ | 
|                 let State <- 1.␊ | 
| ␊ | 
|         1:  if C is non-zero then␊ | 
|                 let V <- C.␊ | 
|                 let Len <- L(V)␊ | 
|                 let State <- F(Len).␊ | 
|             otherwise if C is zero then␊ | 
|                 copy the value 144 (decimal) to the output stream.␊ | 
|                 let State <- 0␊ | 
| ␊ | 
|         2:  let Len <- Len + C␊ | 
|             let State <- 3.␊ | 
| ␊ | 
|         3:  move backwards D(V,C) bytes in the output stream␊ | 
|             (if this position is before the start of the output␊ | 
|             stream, then assume that all the data before the␊ | 
|             start of the output stream is filled with zeros).␊ | 
|             copy Len+3 bytes from this position to the output stream.␊ | 
|             let State <- 0.␊ | 
|     end case␊ | 
| end loop␊ | 
| ␊ | 
| The functions F, L, and D are dependent on the 'compression␊ | 
| factor', 1 through 4, and are defined as follows:␊ | 
| ␊ | 
| For compression factor 1:␊ | 
|     L(X) equals the lower 7 bits of X.␊ | 
|     F(X) equals 2 if X equals 127 otherwise F(X) equals 3.␊ | 
|     D(X,Y) equals the (upper 1 bit of X) * 256 + Y + 1.␊ | 
| For compression factor 2:␊ | 
|     L(X) equals the lower 6 bits of X.␊ | 
|     F(X) equals 2 if X equals 63 otherwise F(X) equals 3.␊ | 
|     D(X,Y) equals the (upper 2 bits of X) * 256 + Y + 1.␊ | 
| For compression factor 3:␊ | 
|     L(X) equals the lower 5 bits of X.␊ | 
|     F(X) equals 2 if X equals 31 otherwise F(X) equals 3.␊ | 
|     D(X,Y) equals the (upper 3 bits of X) * 256 + Y + 1.␊ | 
| For compression factor 4:␊ | 
|     L(X) equals the lower 4 bits of X.␊ | 
|     F(X) equals 2 if X equals 15 otherwise F(X) equals 3.␊ | 
|     D(X,Y) equals the (upper 4 bits of X) * 256 + Y + 1.␊ | 
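
The state machine and the L, F and D functions above can be sketched
in Python as follows.  The function names make_functions and expand are
hypothetical, and the input is assumed to be the complete byte stream
produced by the probabilistic stage.  The sketch follows the pseudocode
literally and does no error checking.

```python
DLE = 144


def make_functions(factor):
    """Return L, F, D for a compression factor of 1 through 4."""
    low_bits = 8 - factor          # 7, 6, 5 or 4 low bits hold the length
    mask = (1 << low_bits) - 1

    def L(x):
        return x & mask

    def F(x):
        return 2 if x == mask else 3

    def D(x, y):
        return (x >> low_bits) * 256 + y + 1

    return L, F, D


def expand(stream, factor):
    """Expand DLE-encoded repeats back into the original bytes."""
    L, F, D = make_functions(factor)
    out = bytearray()
    state = 0
    v = length = 0
    for c in stream:
        if state == 0:
            if c != DLE:
                out.append(c)
            else:
                state = 1
        elif state == 1:
            if c != 0:
                v = c
                length = L(v)
                state = F(length)
            else:
                out.append(DLE)    # escaped literal 144
                state = 0
        elif state == 2:
            length += c
            state = 3
        else:                      # state 3
            start = len(out) - D(v, c)
            for k in range(length + 3):
                pos = start + k
                # Positions before the start of the output read as zero.
                out.append(out[pos] if pos >= 0 else 0)
            state = 0
    return bytes(out)
```

Copying one byte at a time matters here: the source range may overlap
the bytes being written, so a slice copy would give a different result.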
| ␊ | 
| VIII. Imploding - Method 6␊ | 
| --------------------------␊ | 
| ␊ | 
| The Imploding algorithm is actually a combination of two distinct␊ | 
| algorithms.  The first algorithm compresses repeated byte␊ | 
| sequences using a sliding dictionary.  The second algorithm is␊ | 
| used to compress the encoding of the sliding dictionary output,␊ | 
| using multiple Shannon-Fano trees.␊ | 
| ␊ | 
| The Imploding algorithm can use a 4K or 8K sliding dictionary␊ | 
| size. The dictionary size used can be determined by bit 1 in the␊ | 
| general purpose flag word; a 0 bit indicates a 4K dictionary␊ | 
| while a 1 bit indicates an 8K dictionary.␊ | 
| ␊ | 
| The Shannon-Fano trees are stored at the start of the compressed␊ | 
| file. The number of trees stored is defined by bit 2 in the␊ | 
| general purpose flag word; a 0 bit indicates two trees stored, a␊ | 
| 1 bit indicates three trees are stored.  If 3 trees are stored,␊ | 
| the first Shannon-Fano tree represents the encoding of the␊ | 
| Literal characters, the second tree represents the encoding of␊ | 
| the Length information, the third represents the encoding of the␊ | 
| Distance information.  When 2 Shannon-Fano trees are stored, the␊ | 
| Length tree is stored first, followed by the Distance tree.␊ | 
| ␊ | 
| The Literal Shannon-Fano tree, if present, is used to represent␊ | 
| the entire ASCII character set, and contains 256 values.  This␊ | 
| tree is used to compress any data not compressed by the sliding␊ | 
| dictionary algorithm.  When this tree is present, the Minimum␊ | 
| Match Length for the sliding dictionary is 3.  If this tree is␊ | 
| not present, the Minimum Match Length is 2.␊ | 
| ␊ | 
| The Length Shannon-Fano tree is used to compress the Length part␊ | 
| of the (length,distance) pairs from the sliding dictionary␊ | 
| output.  The Length tree contains 64 values, ranging from the␊ | 
| Minimum Match Length, to 63 plus the Minimum Match Length.␊ | 
| ␊ | 
| The Distance Shannon-Fano tree is used to compress the Distance␊ | 
| part of the (length,distance) pairs from the sliding dictionary␊ | 
| output. The Distance tree contains 64 values, ranging from 0 to␊ | 
| 63, representing the upper 6 bits of the distance value.  The␊ | 
| distance values themselves will be between 0 and the sliding␊ | 
| dictionary size, either 4K or 8K.␊ | 
| ␊ | 
| The Shannon-Fano trees themselves are stored in a compressed␊ | 
| format. The first byte of the tree data represents the number of␊ | 
| bytes of data representing the (compressed) Shannon-Fano tree␊ | 
| minus 1.  The remaining bytes represent the Shannon-Fano tree␊ | 
| data encoded as:␊ | 
| ␊ | 
| High 4 bits: Number of values at this bit length + 1. (1 - 16)␊ | 
| Low  4 bits: Bit Length needed to represent value + 1. (1 - 16)␊ | 
| ␊ | 
| The Shannon-Fano codes can be constructed from the bit lengths␊ | 
| using the following algorithm:␊ | 
| ␊ | 
| 1)  Sort the Bit Lengths in ascending order, while retaining the␊ | 
| order of the original lengths stored in the file.␊ | 
| ␊ | 
| 2)  Generate the Shannon-Fano trees:␊ | 
| ␊ | 
| Code <- 0␊ | 
| CodeIncrement <- 0␊ | 
| LastBitLength <- 0␊ | 
| i <- number of Shannon-Fano codes - 1   (either 255 or 63)␊ | 
| ␊ | 
| loop while i >= 0␊ | 
|     Code = Code + CodeIncrement␊ | 
|     if BitLength(i) <> LastBitLength then␊ | 
|         LastBitLength=BitLength(i)␊ | 
|         CodeIncrement = 1 shifted left (16 - LastBitLength)␊ | 
|     ShannonCode(i) = Code␊ | 
|     i <- i - 1␊ | 
| end loop␊ | 
| ␊ | 
| 3)  Reverse the order of all the bits in the above ShannonCode()␊ | 
| vector, so that the most significant bit becomes the least␊ | 
| significant bit.  For example, the value 0x1234 (hex) would␊ | 
| become 0x2C48 (hex).␊ | 
| ␊ | 
| 4)  Restore the order of Shannon-Fano codes as originally stored␊ | 
| within the file.␊ | 
| ␊ | 
| Example:␊ | 
| ␊ | 
| This example will show the encoding of a Shannon-Fano tree␊ | 
| of size 8.  Notice that the actual Shannon-Fano trees used␊ | 
| for Imploding are either 64 or 256 entries in size.␊ | 
| ␊ | 
| Example:   0x02, 0x42, 0x01, 0x13␊ | 
| ␊ | 
| The first byte indicates 3 values in this table.  Decoding the␊ | 
| bytes:␊ | 
| 0x42 = 5 codes of 3 bits long␊ | 
| 0x01 = 1 code  of 2 bits long␊ | 
| 0x13 = 2 codes of 4 bits long␊ | 
| ␊ | 
| This would generate the original bit length array of:␊ | 
| (3, 3, 3, 3, 3, 2, 4, 4)␊ | 
| ␊ | 
| There are 8 codes in this table for the values 0 thru 7.  Using ␊ | 
| the algorithm to obtain the Shannon-Fano codes produces:␊ | 
| ␊ | 
|                                   Reversed     Order     Original␊ | 
| Val  Sorted   Constructed Code      Value     Restored    Length␊ | 
| ---  ------   -----------------   --------    --------    ------␊ | 
| 0:     2      1100000000000000        11       101          3␊ | 
| 1:     3      1010000000000000       101       001          3␊ | 
| 2:     3      1000000000000000       001       110          3␊ | 
| 3:     3      0110000000000000       110       010          3␊ | 
| 4:     3      0100000000000000       010       100          3␊ | 
| 5:     3      0010000000000000       100        11          2␊ | 
| 6:     4      0001000000000000      1000      1000          4␊ | 
| 7:     4      0000000000000000      0000      0000          4␊ | 
| ␊ | 
| The values in the Val, Order Restored and Original Length columns␊ | 
| now represent the Shannon-Fano encoding tree that can be used for␊ | 
| decoding the Shannon-Fano encoded data.  How to parse the␊ | 
| variable length Shannon-Fano values from the data stream is beyond␊ | 
| the scope of this document.  (See the references listed at the end of␊ | 
| this document for more information.)  However, traditional decoding␊ | 
| schemes used for Huffman variable length decoding, such as the␊ | 
| Greenlaw algorithm, can be successfully applied.␊ | 
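
The four construction steps and the worked example above can be
reproduced with a short Python sketch.  The function names
read_bit_lengths, reverse16 and shannon_fano_codes are hypothetical,
and the sketch assumes the tree data is already available as a byte
string.  It is meant as a check of the table above, not as a complete
decoder.

```python
def read_bit_lengths(tree_data):
    """Expand the compressed tree bytes into one bit length per value."""
    count = tree_data[0] + 1               # first byte is (byte count - 1)
    lengths = []
    for b in tree_data[1:1 + count]:
        repeat = (b >> 4) + 1              # high 4 bits: number of values - 1
        bit_length = (b & 0x0F) + 1        # low 4 bits: bit length - 1
        lengths.extend([bit_length] * repeat)
    return lengths


def reverse16(value):
    """Reverse the order of the bits in a 16-bit value."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def shannon_fano_codes(lengths):
    """Return the bit-reversed code for each value, in original order."""
    # Step 1: stable sort by bit length keeps the stored order within a length.
    order = sorted(range(len(lengths)), key=lambda v: lengths[v])
    codes = [0] * len(lengths)
    code = increment = last_length = 0
    # Step 2: assign codes from the last sorted entry back to the first.
    for i in range(len(order) - 1, -1, -1):
        code += increment
        value = order[i]
        if lengths[value] != last_length:
            last_length = lengths[value]
            increment = 1 << (16 - last_length)
        # Steps 3 and 4: reverse the bits and store under the original value.
        codes[value] = reverse16(code)
    return codes
```

For the example bytes 0x02, 0x42, 0x01, 0x13, formatting each returned
code with its own bit length gives the Order Restored column of the
table.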
| ␊ | 
| The compressed data stream begins immediately after the␊ | 
| compressed Shannon-Fano data.  The compressed data stream can be␊ | 
| interpreted as follows:␊ | 
| ␊ | 
| loop until done␊ | 
|     read 1 bit from input stream.␊ | 
| ␊ | 
|     if this bit is non-zero then       (encoded data is literal data)␊ | 
|         if Literal Shannon-Fano tree is present␊ | 
|             read and decode character using Literal Shannon-Fano tree.␊ | 
|         otherwise␊ | 
|             read 8 bits from input stream.␊ | 
|         copy character to the output stream.␊ | 
|     otherwise              (encoded data is sliding dictionary match)␊ | 
|         if 8K dictionary size␊ | 
|             read 7 bits for offset Distance (lower 7 bits of offset).␊ | 
|         otherwise␊ | 
|             read 6 bits for offset Distance (lower 6 bits of offset).␊ | 
| ␊ | 
|         using the Distance Shannon-Fano tree, read and decode the␊ | 
|         upper 6 bits of the Distance value.␊ | 
| ␊ | 
|         using the Length Shannon-Fano tree, read and decode␊ | 
|         the Length value.␊ | 
| ␊ | 
|         Length <- Length + Minimum Match Length␊ | 
| ␊ | 
|         if Length = 63 + Minimum Match Length␊ | 
|             read 8 bits from the input stream,␊ | 
|             add this value to Length.␊ | 
| ␊ | 
|         move backwards Distance+1 bytes in the output stream, and␊ | 
|         copy Length characters from this position to the output␊ | 
|         stream.  (if this position is before the start of the output␊ | 
|         stream, then assume that all the data before the start of␊ | 
|         the output stream is filled with zeros).␊ | 
| end loop␊ | 
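
The match-handling branch of the loop above can be sketched in Python
as follows.  The helper names implode_distance, implode_length and
copy_match are hypothetical, and the values passed in are assumed to
have been decoded from the bit stream and Shannon-Fano trees already.

```python
def implode_distance(upper6, low, dict_8k):
    """Combine the tree-decoded upper 6 bits with the raw low bits."""
    low_bits = 7 if dict_8k else 6
    return (upper6 << low_bits) | low


def implode_length(code, literal_tree_present, extra_byte=0):
    """Turn a Length tree value (0-63) into a match length."""
    minimum = 3 if literal_tree_present else 2
    length = code + minimum
    if code == 63:                 # longest code: one more byte follows
        length += extra_byte
    return length


def copy_match(out, distance, length):
    """Append length bytes starting distance+1 bytes back in out."""
    start = len(out) - (distance + 1)
    for k in range(length):
        pos = start + k
        # Data before the start of the output is treated as zeros.
        out.append(out[pos] if pos >= 0 else 0)
```

For example, with b"abc" already in the output, a Distance of 2 and a
Length of 5 append b"abcab".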
| ␊ | 
| IX. Tokenizing - Method 7␊ | 
| -------------------------␊ | 
| ␊ | 
| This method is not used by PKZIP.␊ | 
| ␊ | 
| X. Deflating - Method 8␊ | 
| -----------------------␊ | 
| ␊ | 
| The Deflate algorithm is similar to the Implode algorithm using␊ | 
| a sliding dictionary of up to 32K with secondary compression␊ | 
| from Huffman/Shannon-Fano codes.␊ | 
| ␊ | 
| The compressed data is stored in blocks with a header describing␊ | 
| the block and the Huffman codes used in the data block.  The header␊ | 
| format is as follows:␊ | 
| ␊ | 
| Bit 0: Last Block bit     This bit is set to 1 if this is the last␊ | 
|                           compressed block in the data.␊ | 
| Bits 1-2: Block type␊ | 
|    00 (0) - Block is stored - All stored data is byte aligned.␊ | 
|             Skip bits until next byte, then next word = block ␊ | 
|             length, followed by the ones' complement of the block␊ | 
|             length word. Remaining data in block is the stored ␊ | 
|             data.␊ | 
| ␊ | 
|    01 (1) - Use fixed Huffman codes for literal and distance codes.␊ | 
|             Lit Code    Bits             Dist Code   Bits␊ | 
|             ---------   ----             ---------   ----␊ | 
|               0 - 143    8                 0 - 31      5␊ | 
|             144 - 255    9␊ | 
|             256 - 279    7␊ | 
|             280 - 287    8␊ | 
| ␊ | 
|             Literal codes 286-287 and distance codes 30-31 are ␊ | 
|             never used but participate in the Huffman construction.␊ | 
| ␊ | 
|    10 (2) - Dynamic Huffman codes.  (See expanding Huffman codes)␊ | 
| ␊ | 
|    11 (3) - Reserved - Flag an "Error in compressed data" if seen.␊ | 
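
As an illustration of the stored block type, the Python sketch below
builds and parses a single stored block.  The function names
stored_block and parse_stored_block are hypothetical, and the block is
assumed to start on a byte boundary, so the header bits occupy the low
three bits of the first byte.

```python
import struct
import zlib


def stored_block(payload, last=True):
    """Build one stored (type 00) block holding payload."""
    if len(payload) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 bytes")
    header = bytes([1 if last else 0])      # bit 0 = last block, bits 1-2 = 00
    length = len(payload)
    # Block length word followed by its ones' complement, low byte first.
    return header + struct.pack("<HH", length, length ^ 0xFFFF) + payload


def parse_stored_block(block):
    """Return (is_last, payload) for a byte-aligned stored block."""
    is_last = bool(block[0] & 1)
    block_type = (block[0] >> 1) & 3
    if block_type != 0:
        raise ValueError("not a stored block")
    length, check = struct.unpack_from("<HH", block, 1)
    if length ^ 0xFFFF != check:
        raise ValueError("Error in compressed data")
    return is_last, block[5:5 + length]
```

Python's zlib module accepts raw Deflate data when given a negative
window size, so zlib.decompress(stored_block(b"hello"), -15) can be
used to cross-check the layout.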
| ␊ | 
| Expanding Huffman Codes␊ | 
| -----------------------␊ | 
| If the data block is stored with dynamic Huffman codes, the Huffman␊ | 
| codes are sent in the following compressed format:␊ | 
| ␊ | 
| 5 Bits: # of Literal codes sent - 257 (257 - 286)␊ | 
|         All other codes are never sent.␊ | 
| 5 Bits: # of Dist codes - 1           (1 - 32)␊ | 
| 4 Bits: # of Bit Length codes - 4     (4 - 19)␊ | 
| ␊ | 
| The Huffman codes are sent as bit lengths and the codes are built as␊ | 
| described in the implode algorithm.  The bit lengths themselves are␊ | 
| compressed with Huffman codes.  There are 19 bit length codes:␊ | 
| ␊ | 
| 0 - 15: Represent bit lengths of 0 - 15␊ | 
| 16: Copy the previous bit length 3 - 6 times.␊ | 
| The next 2 bits indicate repeat length (0 = 3, ... ,3 = 6)␊ | 
| Example:  Codes 8, 16 (+2 bits 11), 16 (+2 bits 10) will␊ | 
| expand to 12 bit lengths of 8 (1 + 6 + 5)␊ | 
| 17: Repeat a bit length of 0 for 3 - 10 times. (3 bits of length)␊ | 
| 18: Repeat a bit length of 0 for 11 - 138 times (7 bits of length)␊ | 
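
The repeat codes above can be sketched in Python as follows.  The
function name expand_bit_lengths is hypothetical, and each input item
is assumed to be a (code, extra) pair whose extra bits have already
been read from the stream.

```python
def expand_bit_lengths(symbols):
    """Expand decoded bit length codes (0-18) into a list of bit lengths."""
    lengths = []
    for code, extra in symbols:
        if code <= 15:
            lengths.append(code)                       # literal bit length
        elif code == 16:
            if not lengths:
                raise ValueError("code 16 has no previous length to copy")
            lengths.extend([lengths[-1]] * (3 + extra))  # 2 extra bits
        elif code == 17:
            lengths.extend([0] * (3 + extra))          # 3 extra bits
        elif code == 18:
            lengths.extend([0] * (11 + extra))         # 7 extra bits
        else:
            raise ValueError("invalid bit length code")
    return lengths
```

The example in the list above corresponds to
expand_bit_lengths([(8, 0), (16, 3), (16, 2)]), which yields twelve
bit lengths of 8.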
| ␊ | 
| The lengths of the bit length codes are sent packed 3 bits per value␊ | 
| (0 - 7) in the following order:␊ | 
| ␊ | 
| 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15␊ | 
| ␊ | 
| The Huffman codes should be built as described in the Implode algorithm␊ | 
| except codes are assigned starting at the shortest bit length, i.e. the␊ | 
| shortest code should be all 0's rather than all 1's.  Also, codes with␊ | 
| a bit length of zero do not participate in the tree construction.  The␊ | 
| codes are then used to decode the bit lengths for the literal and ␊ | 
| distance tables.␊ | 
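
A common way to assign codes starting from the shortest bit length is
sketched below in Python.  The function name canonical_codes is
hypothetical, and the counting approach shown is the one widely used
for Deflate rather than a transcription of the Implode procedure.  The
fixed literal table from the block header description serves as test
data.

```python
def canonical_codes(lengths):
    """Map each symbol with a non-zero bit length to its code string."""
    max_len = max(lengths)
    # Count how many symbols use each bit length; zero lengths are skipped.
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            count[n] += 1
    # First code of each length: shorter codes come first, starting at 0.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + count[bits - 1]) << 1
        next_code[bits] = code
    # Symbols of equal length receive consecutive codes in symbol order.
    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = format(next_code[n], "0{}b".format(n))
            next_code[n] += 1
    return codes


# Bit lengths of the fixed literal table: 0-143, 144-255, 256-279, 280-287.
FIXED_LITERAL_LENGTHS = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8
```

With these lengths, the end-of-block code 256 receives the all-zero
7-bit code, the shortest code in the table.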
| ␊ | 
| The bit lengths for the literal tables are sent first with the number␊ | 
| of entries sent described by the 5 bits sent earlier.  There are up␊ | 
| to 286 literal characters; the first 256 represent the respective 8␊ | 
| bit character, code 256 represents the End-Of-Block code, the remaining␊ | 
| 29 codes represent copy lengths of 3 thru 258.  There are up to 30␊ | 
| distance codes representing distances from 1 thru 32k as described␊ | 
| below.␊ | 
| ␊ | 
| Length Codes␊ | 
| ------------␊ | 
|      Extra             Extra              Extra              Extra␊ | 
| Code Bits Length  Code Bits Lengths  Code Bits Lengths  Code Bits Length(s)␊ | 
| ---- ---- ------  ---- ---- -------  ---- ---- -------  ---- ---- ---------␊ | 
|  257   0     3     265   1   11,12    273   3   35-42    281   5  131-162␊ | 
|  258   0     4     266   1   13,14    274   3   43-50    282   5  163-194␊ | 
|  259   0     5     267   1   15,16    275   3   51-58    283   5  195-226␊ | 
|  260   0     6     268   1   17,18    276   3   59-66    284   5  227-257␊ | 
|  261   0     7     269   2   19-22    277   4   67-82    285   0    258␊ | 
|  262   0     8     270   2   23-26    278   4   83-98␊ | 
|  263   0     9     271   2   27-30    279   4   99-114␊ | 
|  264   0    10     272   2   31-34    280   4  115-130␊ | 
| ␊ | 
| Distance Codes␊ | 
| --------------␊ | 
|      Extra           Extra             Extra               Extra␊ | 
| Code Bits Dist  Code Bits  Dist   Code Bits Distance  Code Bits Distance␊ | 
| ---- ---- ----  ---- ---- ------  ---- ---- --------  ---- ---- --------␊ | 
|   0   0    1      8   3   17-24    16    7  257-384    24   11  4097-6144␊ | 
|   1   0    2      9   3   25-32    17    7  385-512    25   11  6145-8192␊ | 
|   2   0    3     10   4   33-48    18    8  513-768    26   12  8193-12288␊ | 
|   3   0    4     11   4   49-64    19    8  769-1024   27   12 12289-16384␊ | 
|   4   1   5,6    12   5   65-96    20    9 1025-1536   28   13 16385-24576␊ | 
|   5   1   7,8    13   5   97-128   21    9 1537-2048   29   13 24577-32768␊ | 
|   6   2   9-12   14   6  129-192   22   10 2049-3072␊ | 
|   7   2  13-16   15   6  193-256   23   10 3073-4096␊ | 
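
The two tables above translate directly into lookup arrays.  In the
Python sketch below, the names LENGTH_BASE, LENGTH_EXTRA, DIST_BASE,
DIST_EXTRA, match_length and match_distance are hypothetical, and the
extra-bits value is assumed to have been read from the stream by the
caller.

```python
# Smallest length for codes 257..285 and the number of extra bits each takes.
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LENGTH_EXTRA = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
                3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0]

# Smallest distance for codes 0..29 and the number of extra bits each takes.
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
             257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
             8193, 12289, 16385, 24577]
DIST_EXTRA = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6,
              7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13]


def match_length(code, extra=0):
    """Copy length for a length code (257-285) plus its extra-bits value."""
    index = code - 257
    if not 0 <= index < len(LENGTH_BASE):
        raise ValueError("not a length code")
    if extra >> LENGTH_EXTRA[index]:
        raise ValueError("extra value does not fit in the extra bits")
    return LENGTH_BASE[index] + extra


def match_distance(code, extra=0):
    """Distance for a distance code (0-29) plus its extra-bits value."""
    if not 0 <= code < len(DIST_BASE):
        raise ValueError("not a distance code")
    if extra >> DIST_EXTRA[code]:
        raise ValueError("extra value does not fit in the extra bits")
    return DIST_BASE[code] + extra
```

For instance, length code 265 with an extra bit of 1 gives a length of
12, and distance code 29 with all 13 extra bits set gives 32768.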
| ␊ | 
| The compressed data stream begins immediately after the␊ | 
| compressed header data.  The compressed data stream can be␊ | 
| interpreted as follows:␊ | 
| ␊ | 
| do␊ | 
|     read header from input stream.␊ | 
| ␊ | 
|     if stored block␊ | 
|         skip bits until byte aligned␊ | 
|         read count and ones' complement of count␊ | 
|         copy count bytes data block␊ | 
|     otherwise␊ | 
|         loop until end of block code sent␊ | 
|             decode literal character from input stream␊ | 
|             if literal < 256␊ | 
|                 copy character to the output stream␊ | 
|             otherwise␊ | 
|                 if literal = end of block␊ | 
|                     break from loop␊ | 
|                 otherwise␊ | 
|                     compute length from the literal code and its extra bits␊ | 
|                     decode distance from input stream␊ | 
| ␊ | 
|                     move backwards distance bytes in the output stream, and␊ | 
|                     copy length characters from this position to the output␊ | 
|                     stream.␊ | 
|         end loop␊ | 
| while not last block␊ | 
| ␊ | 
| if data descriptor exists␊ | 
|     skip bits until byte aligned␊ | 
|     read crc and sizes␊ | 
| endif␊ | 
| ␊ | 
| XI. Enhanced Deflating - Method 9␊ | 
| ---------------------------------␊ | 
| ␊ | 
| The Enhanced Deflating algorithm is similar to Deflate but␊ | 
| uses a sliding dictionary of up to 64K. Deflate64(tm) is supported␊ | 
| by the Deflate extractor. ␊ | 
| ␊ | 
| XII. BZIP2 - Method 12␊ | 
| ----------------------␊ | 
| ␊ | 
| BZIP2 is an open-source data compression algorithm developed by ␊ | 
| Julian Seward.  Information and source code for this algorithm␊ | 
| can be found on the internet.␊ | 
| ␊ | 
| XIII. LZMA - Method 14 (EFS)␊ | 
| ----------------------------␊ | 
| ␊ | 
| LZMA is a block-oriented, general purpose data compression algorithm  ␊ | 
| developed and maintained by Igor Pavlov.  It is a derivative of LZ77␊ | 
| that utilizes Markov chains and a range coder.  Information and ␊ | 
| source code for this algorithm can be found on the internet.  Consult ␊ | 
| with the author of this algorithm for information on terms or ␊ | 
| restrictions on use.␊ | 
| ␊ | 
| Support for LZMA within the ZIP format is defined as follows:   ␊ | 
| ␊ | 
| The Compression method field within the ZIP Local and Central ␊ | 
| Header records will be set to the value 14 to indicate data was␊ | 
| compressed using LZMA. ␊ | 
| ␊ | 
| The Version needed to extract field within the ZIP Local and ␊ | 
| Central Header records will be set to 6.3 to indicate the ␊ | 
| minimum ZIP format version supporting this feature.␊ | 
| ␊ | 
| File data compressed using the LZMA algorithm must be placed ␊ | 
| immediately following the Local Header for the file.  If a ␊ | 
| standard ZIP encryption header is required, it will follow ␊ | 
| the Local Header and will precede the LZMA compressed file ␊ | 
| data segment.  The location of LZMA compressed data segment ␊ | 
| within the ZIP format will be as shown:␊ | 
| ␊ | 
| [local header file 1]␊ | 
| [encryption header file 1]␊ | 
| [LZMA compressed data segment for file 1]␊ | 
| [data descriptor 1]␊ | 
| [local header file 2]␊ | 
| ␊ | 
| The encryption header and data descriptor records may␊ | 
| be conditionally present.  The LZMA Compressed Data Segment ␊ | 
| will consist of an LZMA Properties Header followed by the ␊ | 
| LZMA Compressed Data as shown:␊ | 
| ␊ | 
| [LZMA properties header for file 1]␊ | 
| [LZMA compressed data for file 1]␊ | 
| ␊ | 
| The LZMA Compressed Data will be stored as provided by the ␊ | 
| LZMA compression library.  Compressed size, uncompressed ␊ | 
| size and other file characteristics about the file being ␊ | 
| compressed must be stored in standard ZIP storage format.␊ | 
| ␊ | 
| The LZMA Properties Header will store specific data required to ␊ | 
| decompress the LZMA compressed Data.  This data is set by the ␊ | 
| LZMA compression engine using the function WriteCoderProperties() ␊ | 
| as documented within the LZMA SDK. ␊ | 
| ␊ | 
| Storage fields for the property information within the LZMA ␊ | 
| Properties Header are as follows:␊ | 
| ␊ | 
| LZMA Version Information    2 bytes␊ | 
| LZMA Properties Size        2 bytes␊ | 
| LZMA Properties Data        variable, defined by "LZMA Properties Size"␊ | 
| ␊ | 
| LZMA Version Information - this field identifies which version of ␊ | 
| the LZMA SDK was used to compress a file.  The first byte will ␊ | 
| store the major version number of the LZMA SDK and the second ␊ | 
| byte will store the minor number.  ␊ | 
| ␊ | 
| LZMA Properties Size - this field defines the size of the remaining ␊ | 
| property data.  Typically this size should be determined by the ␊ | 
| version of the SDK.  This size field is included as a convenience␊ | 
| and to help avoid any ambiguity should it arise in the future due␊ | 
| to changes in this compression algorithm. ␊ | 
| ␊ | 
| LZMA Properties Data - this variable sized field records the required ␊ | 
| values for the decompressor as defined by the LZMA SDK.  The ␊ | 
| data stored in this field should be obtained using the ␊ | 
| WriteCoderProperties() in the version of the SDK defined by ␊ | 
| the "LZMA Version Information" field.  ␊ | 
| ␊ | 
| The layout of the "LZMA Properties Data" field is a function of the␊ | 
| LZMA compression algorithm.  It is possible that this layout may be␊ | 
| changed by the author over time.  The data layout in version 4.32 ␊ | 
| of the LZMA SDK defines a 5 byte array that uses 4 bytes to store ␊ | 
| the dictionary size in little-endian order. This is preceded by a ␊ | 
| single packed byte as the first element of the array that contains␊ | 
| the following fields:␊ | 
| ␊ | 
| PosStateBits␊ | 
| LiteralPosStateBits␊ | 
| LiteralContextBits␊ | 
| ␊ | 
| Refer to the LZMA documentation for a more detailed explanation of ␊ | 
| these fields.  ␊ | 
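
A reader for the LZMA Properties Header might look like the Python
sketch below.  The function name parse_lzma_properties_header is
hypothetical.  Two assumptions go beyond the text above: the 2-byte
size field is read low byte first like other ZIP fields, and the
packed byte is unpacked with the LZMA SDK convention
(pb * 5 + lp) * 9 + lc, which should be confirmed against the LZMA
documentation.

```python
import struct


def parse_lzma_properties_header(data):
    """Split the LZMA Properties Header off the front of a data segment."""
    major, minor, size = struct.unpack_from("<BBH", data, 0)
    props = data[4:4 + size]
    if len(props) != size:
        raise ValueError("truncated LZMA properties data")
    header = {
        "sdk_version": (major, minor),
        "properties": props,
        "header_length": 4 + size,     # compressed data starts here
    }
    if size == 5:
        # Layout described for SDK 4.32: packed byte, then dictionary size.
        packed = props[0]
        header["lc"] = packed % 9             # LiteralContextBits
        header["lp"] = (packed // 9) % 5      # LiteralPosStateBits
        header["pb"] = packed // 45           # PosStateBits
        header["dict_size"] = struct.unpack_from("<I", props, 1)[0]
    return header
```

The raw properties bytes are kept in the result so that they can be
handed to an LZMA decoder unchanged.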
| ␊ | 
| Data compressed with method 14, LZMA, may include an end-of-stream␊ | 
| (EOS) marker ending the compressed data stream.  This marker is not␊ | 
| required, but its use is highly recommended to facilitate processing␊ | 
| and implementers should include the EOS marker whenever possible.␊ | 
| When the EOS marker is used, general purpose bit 1 must be set.  If␊ | 
| general purpose bit 1 is not set, the EOS marker is not present.␊ | 
| ␊ | 
| XIV. PPMd - Method 98␊ | 
| ---------------------␊ | 
| ␊ | 
| PPMd is a data compression algorithm developed by Dmitry Shkarin␊ | 
| which includes a carryless rangecoder developed by Dmitry Subbotin.␊ | 
| This algorithm is based on prediction by partial matching on multiple␊ | 
| order contexts.  Information and source code for this algorithm␊ | 
| can be found on the internet. Consult with the author of this␊ | 
| algorithm for information on terms or restrictions on use.␊ | 
| ␊ | 
| Support for PPMd within the ZIP format currently is provided only ␊ | 
| for version I, revision 1 of the algorithm.  Storage requirements␊ | 
| for using this algorithm are as follows:␊ | 
| ␊ | 
| Parameters needed to control the algorithm are stored in the two␊ | 
| bytes immediately preceding the compressed data.  These bytes are␊ | 
| used to store the following fields:␊ | 
| ␊ | 
| Model order - sets the maximum model order, default is 8, possible␊ | 
| values are from 2 to 16 inclusive␊ | 
| ␊ | 
| Sub-allocator size - sets the size of sub-allocator in MB, default is 50,␊ | 
| possible values are from 1MB to 256MB inclusive␊ | 
| ␊ | 
| Model restoration method - sets the method used to restart context␊ | 
| model at memory insufficiency, values are:␊ | 
| ␊ | 
| 0 - restarts model from scratch - default␊ | 
| 1 - cut off model - decreases performance by as much as 2x␊ | 
| 2 - freeze context tree - not recommended␊ | 
| ␊ | 
| An example for packing these fields into the 2 byte storage field is␊ | 
| illustrated below.  These values are stored in Intel low-byte/high-byte␊ | 
| order.␊ | 
| ␊ | 
| wPPMd = (Model order - 1) + ␊ | 
| ((Sub-allocator size - 1) << 4) + ␊ | 
| (Model restoration method << 12)␊ | 
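
The packing formula above, and its inverse, can be written in Python
as follows.  The function names pack_ppmd_params and
unpack_ppmd_params are hypothetical.  The range checks use the limits
listed for each field.

```python
import struct


def pack_ppmd_params(order, mem_mb, restore):
    """Pack the three PPMd parameters into the 2-byte field."""
    if not 2 <= order <= 16:
        raise ValueError("model order must be 2..16")
    if not 1 <= mem_mb <= 256:
        raise ValueError("sub-allocator size must be 1..256 MB")
    if restore not in (0, 1, 2):
        raise ValueError("model restoration method must be 0, 1 or 2")
    word = (order - 1) + ((mem_mb - 1) << 4) + (restore << 12)
    return struct.pack("<H", word)       # low byte first


def unpack_ppmd_params(field):
    """Return (order, mem_mb, restore) from the 2-byte field."""
    (word,) = struct.unpack("<H", field)
    order = (word & 0x0F) + 1
    mem_mb = ((word >> 4) & 0xFF) + 1
    restore = word >> 12
    return order, mem_mb, restore
```

With the default values (order 8, 50 MB, method 0) the packed word is
0x0317, stored as the bytes 0x17, 0x03.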
| ␊ | 
| ␊ | 
| XV. Traditional PKWARE Encryption␊ | 
| ---------------------------------␊ | 
| ␊ | 
| The following information discusses the decryption steps␊ | 
| required to support traditional PKWARE encryption.  This␊ | 
| form of encryption is considered weak by today's standards␊ | 
| and its use is recommended only for situations with␊ | 
| low security needs or for compatibility with older .ZIP ␊ | 
| applications.␊ | 
| ␊ | 
| Decryption␊ | 
| ----------␊ | 
| ␊ | 
| PKWARE is grateful to Mr. Roger Schlafly for his expert contribution ␊ | 
| towards the development of PKWARE's traditional encryption.␊ | 
| ␊ | 
| PKZIP encrypts the compressed data stream.  Encrypted files must␊ | 
| be decrypted before they can be extracted.␊ | 
| ␊ | 
| Each encrypted file has an extra 12 bytes stored at the start of␊ | 
| the data area defining the encryption header for that file.  The␊ | 
| encryption header is originally set to random values, and then␊ | 
| itself encrypted, using three, 32-bit keys.  The key values are␊ | 
| initialized using the supplied encryption password.  After each byte␊ | 
| is encrypted, the keys are then updated using pseudo-random number␊ | 
| generation techniques in combination with the same CRC-32 algorithm␊ | 
| used in PKZIP and described elsewhere in this document.␊ | 
| ␊ | 
| The following are the basic steps required to decrypt a file:␊ | 
| ␊ | 
| 1) Initialize the three 32-bit keys with the password.␊ | 
| 2) Read and decrypt the 12-byte encryption header, further␊ | 
| initializing the encryption keys.␊ | 
| 3) Read and decrypt the compressed data stream using the␊ | 
| encryption keys.␊ | 
| ␊ | 
| Step 1 - Initializing the encryption keys␊ | 
| -----------------------------------------␊ | 
| ␊ | 
| Key(0) <- 305419896␊ | 
| Key(1) <- 591751049␊ | 
| Key(2) <- 878082192␊ | 
| ␊ | 
| loop for i <- 0 to length(password)-1␊ | 
|     update_keys(password(i))␊ | 
| end loop␊ | 
| ␊ | 
| Where update_keys() is defined as:␊ | 
| ␊ | 
| update_keys(char):␊ | 
|     Key(0) <- crc32(Key(0),char)␊ | 
|     Key(1) <- Key(1) + (Key(0) & 000000ffH)␊ | 
|     Key(1) <- Key(1) * 134775813 + 1␊ | 
|     Key(2) <- crc32(Key(2),Key(1) >> 24)␊ | 
| end update_keys␊ | 
| ␊ | 
| Where crc32(old_crc,char) is a routine that given a CRC value and a␊ | 
| character, returns an updated CRC value after applying the CRC-32␊ | 
| algorithm described elsewhere in this document.␊ | 
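
Step 1 can be sketched in Python as follows.  The names CRC_TABLE,
crc32_update, update_keys and init_keys are hypothetical.  The sketch
assumes the usual reflected CRC-32 polynomial 0xEDB88320, that the
password is supplied as bytes, and that every key is truncated to 32
bits after each operation.

```python
def _make_crc_table():
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


def crc32_update(old_crc, byte):
    """Return the CRC after one byte, with no pre- or post-inversion."""
    return CRC_TABLE[(old_crc ^ byte) & 0xFF] ^ (old_crc >> 8)


def update_keys(keys, byte):
    """Advance the three 32-bit keys by one plaintext byte, in place."""
    keys[0] = crc32_update(keys[0], byte)
    keys[1] = (keys[1] + (keys[0] & 0xFF)) & 0xFFFFFFFF
    keys[1] = (keys[1] * 134775813 + 1) & 0xFFFFFFFF
    keys[2] = crc32_update(keys[2], keys[1] >> 24)


def init_keys(password):
    """Return the key state after processing the password bytes."""
    keys = [305419896, 591751049, 878082192]
    for byte in password:
        update_keys(keys, byte)
    return keys
```

The three starting constants are 0x12345678, 0x23456789 and 0x34567890
in hexadecimal.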
| ␊ | 
| Step 2 - Decrypting the encryption header␊ | 
| -----------------------------------------␊ | 
| ␊ | 
| The purpose of this step is to further initialize the encryption␊ | 
| keys, based on random data, to render a plaintext attack on the␊ | 
| data ineffective.␊ | 
| ␊ | 
| Read the 12-byte encryption header into Buffer, in locations␊ | 
| Buffer(0) thru Buffer(11).␊ | 
| ␊ | 
| loop for i <- 0 to 11␊ | 
|     C <- Buffer(i) ^ decrypt_byte()␊ | 
|     update_keys(C)␊ | 
|     Buffer(i) <- C␊ | 
| end loop␊ | 
| ␊ | 
| Where decrypt_byte() is defined as:␊ | 
| ␊ | 
| unsigned char decrypt_byte()␊ | 
|     local unsigned short temp␊ | 
|     temp <- Key(2) | 2␊ | 
|     decrypt_byte <- (temp * (temp ^ 1)) >> 8␊ | 
| end decrypt_byte␊ | 
| ␊ | 
| After the header is decrypted,  the last 1 or 2 bytes in Buffer␊ | 
| should be the high-order word/byte of the CRC for the file being␊ | 
| decrypted, stored in Intel low-byte/high-byte order.  Versions of␊ | 
| PKZIP prior to 2.0 used a 2 byte CRC check; a 1 byte CRC check is␊ | 
| used on versions after 2.0.  This can be used to test if the password␊ | 
| supplied is correct or not.␊ | 
| ␊ | 
| Step 3 - Decrypting the compressed data stream␊ | 
| ----------------------------------------------␊ | 
| ␊ | 
| The compressed data stream can be decrypted as follows:␊ | 
| ␊ | 
| loop until done␊ | 
|     read a character into C␊ | 
|     Temp <- C ^ decrypt_byte()␊ | 
|     update_keys(Temp)␊ | 
|     output Temp␊ | 
| end loop␊ | 
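
The three decryption steps can be combined into one small Python
class, sketched below.  The class name TraditionalCipher and its
methods are hypothetical.  The raw CRC update is obtained from
Python's zlib.crc32 by undoing its built-in inversions, and the
encrypt method is included only so that the sketch can be exercised
with a round trip.

```python
import zlib


class TraditionalCipher:
    """Key state for traditional PKWARE encryption (illustrative sketch)."""

    def __init__(self, password):
        # Step 1: initialize the keys, then mix in the password bytes.
        self.keys = [305419896, 591751049, 878082192]
        for byte in password:
            self._update_keys(byte)

    @staticmethod
    def _crc32(crc, byte):
        # zlib.crc32 inverts the CRC on entry and exit; undo both inversions.
        return zlib.crc32(bytes([byte]), crc ^ 0xFFFFFFFF) ^ 0xFFFFFFFF

    def _update_keys(self, byte):
        k = self.keys
        k[0] = self._crc32(k[0], byte)
        k[1] = (k[1] + (k[0] & 0xFF)) & 0xFFFFFFFF
        k[1] = (k[1] * 134775813 + 1) & 0xFFFFFFFF
        k[2] = self._crc32(k[2], k[1] >> 24)

    def _decrypt_byte(self):
        temp = (self.keys[2] | 2) & 0xFFFF      # unsigned short
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        """Steps 2 and 3: decrypt the header and data as one stream."""
        out = bytearray()
        for c in data:
            plain = c ^ self._decrypt_byte()
            self._update_keys(plain)            # keys follow the plaintext
            out.append(plain)
        return bytes(out)

    def encrypt(self, data):
        """Inverse of decrypt, used here only for testing."""
        out = bytearray()
        for plain in data:
            out.append(plain ^ self._decrypt_byte())
            self._update_keys(plain)
        return bytes(out)
```

In use, the first 12 bytes returned by decrypt are the encryption
header.  Its last byte can be compared with the high-order byte of the
file's CRC to test the password before the remaining bytes are passed
to the decompressor.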
| ␊ | 
| ␊ | 
| XVI. Strong Encryption Specification␊ | 
| ------------------------------------␊ | 
| ␊ | 
| The Strong Encryption technology defined in this specification is ␊ | 
| covered under a pending patent application. The use or implementation␊ | 
| in a product of certain technological aspects set forth in the current␊ | 
| APPNOTE, including those with regard to strong encryption, patching, ␊ | 
| or extended tape operations requires a license from PKWARE. Portions␊ | 
| of this Strong Encryption technology are available for use at no charge.␊ | 
| Contact PKWARE for licensing terms and conditions. Refer to section II␊ | 
| of this APPNOTE (Contacting PKWARE) for information on how to ␊ | 
| contact PKWARE. ␊ | 
| ␊ | 
| Version 5.x of this specification introduced support for strong ␊ | 
| encryption algorithms.  These algorithms can be used with either ␊ | 
| a password or an X.509v3 digital certificate to encrypt each file. ␊ | 
| This format specification supports either password or certificate ␊ | 
| based encryption to meet the security needs of today, to enable ␊ | 
| interoperability between users within both PKI and non-PKI ␊ | 
| environments, and to ensure interoperability between different ␊ | 
| computing platforms that are running a ZIP program.  ␊ | 
| ␊ | 
| Password based encryption is the most common form of encryption ␊ | 
| people are familiar with.  However, inherent weaknesses with ␊ | 
| passwords (e.g. susceptibility to dictionary/brute force attack) ␊ | 
| as well as password management and support issues make certificate ␊ | 
| based encryption a more secure and scalable option.  Industry ␊ | 
| efforts and support are defining and moving towards more advanced ␊ | 
| security solutions built around X.509v3 digital certificates and ␊ | 
| Public Key Infrastructures (PKI) because of the greater scalability, ␊ | 
| administrative options, and more robust security over traditional ␊ | 
| password based encryption. ␊ | 
| ␊ | 
| Most standard encryption algorithms are supported with this␊ | 
| specification. Reference implementations for many of these ␊ | 
| algorithms are available from either commercial or open source ␊ | 
| distributors.  Readily available cryptographic toolkits make␊ | 
| implementation of the encryption features straightforward.  ␊ | 
| This document is not intended to provide a treatise on data ␊ | 
| encryption principles or theory.  Its purpose is to document the ␊ | 
| data structures required for implementing interoperable data ␊ | 
| encryption within the .ZIP format.  It is strongly recommended that ␊ | 
| you have a good understanding of data encryption before reading ␊ | 
| further.␊ | 
| ␊ | 
| The algorithms introduced in Version 5.0 of this specification ␊ | 
| include:␊ | 
| ␊ | 
| RC2 40 bit, 64 bit, and 128 bit␊ | 
| RC4 40 bit, 64 bit, and 128 bit␊ | 
| DES␊ | 
| 3DES 112 bit and 168 bit␊ | 
| ␊ | 
| Version 5.1 adds support for the following:␊ | 
| ␊ | 
| AES 128 bit, 192 bit, and 256 bit␊ | 
| ␊ | 
| ␊ | 
| Version 6.1 introduces encryption data changes to support ␊ | 
| interoperability with Smartcard and USB Token certificate storage ␊ | 
| methods which do not support the OAEP strengthening standard.␊ | 
| ␊ | 
| Version 6.2 introduces support for encrypting metadata by compressing ␊ | 
| and encrypting the central directory data structure to reduce information ␊ | 
| leakage.   Information leakage can occur in legacy ZIP applications ␊ | 
| through exposure of information about a file even though that file is ␊ | 
| stored encrypted.  The information exposed consists of file ␊ | 
| characteristics stored within the records and fields defined by this ␊ | 
| specification.  This includes data such as a file's name, its original ␊ | 
| size, timestamp and CRC32 value. ␊ | 
| ␊ | 
| Version 6.3 introduces support for encrypting data using the Blowfish␊ | 
| and Twofish algorithms.  These are symmetric block ciphers developed ␊ | 
| by Bruce Schneier.  Blowfish supports using a variable length key from ␊ | 
| 32 to 448 bits.  Block size is 64 bits.  Implementations should use 16␊ | 
| rounds and the only mode supported within ZIP files is CBC. Twofish ␊ | 
| supports key sizes 128, 192 and 256 bits.  Block size is 128 bits.  ␊ | 
| Implementations should use 16 rounds and the only mode supported within␊ | 
| ZIP files is CBC.  Information and source code for both Blowfish and ␊ | 
| Twofish algorithms can be found on the internet.  Consult with the author␊ | 
| of these algorithms for information on terms or restrictions on use.␊ | 
| ␊ | 
| Central Directory Encryption provides greater protection against ␊ | 
| information leakage by encrypting the Central Directory structure and ␊ | 
| by masking key values that are replicated in the unencrypted Local ␊ | 
| Header.   ZIP compatible programs that cannot interpret an encrypted ␊ | 
| Central Directory structure cannot rely on the data in the corresponding ␊ | 
| Local Header for decompression information.  ␊ | 
| ␊ | 
| Extra Field records that may contain information about a file that should ␊ | 
| not be exposed should not be stored in the Local Header and should only ␊ | 
| be written to the Central Directory where they can be encrypted.  This ␊ | 
| design currently does not support streaming.  Information in the End of ␊ | 
| Central Directory record, the Zip64 End of Central Directory Locator, ␊ | 
| and the Zip64 End of Central Directory record is not encrypted.  Access ␊ | 
| to view data on files within a ZIP file with an encrypted Central Directory␊ | 
| requires the appropriate password or private key for decryption prior to ␊ | 
| viewing any files, or any information about the files, in the archive.  ␊ | 
| ␊ | 
| Older ZIP compatible programs not familiar with the Central Directory ␊ | 
| Encryption feature will no longer be able to recognize the Central ␊ | 
| Directory and may assume the ZIP file is corrupt.  Programs that ␊ | 
| attempt streaming access using Local Headers will see invalid ␊ | 
| information for each file.  Central Directory Encryption need not be ␊ | 
| used for every ZIP file.  Its use is recommended for greater security.  ␊ | 
| ZIP files not using Central Directory Encryption should operate as ␊ | 
| in the past. ␊ | 
| ␊ | 
| This strong encryption feature specification is intended to provide for ␊ | 
| scalable, cross-platform encryption needs ranging from simple password␊ | 
| encryption to authenticated public/private key encryption.  ␊ | 
| ␊ | 
| Encryption provides data confidentiality and privacy.  It is ␊ | 
| recommended that you combine X.509 digital signing with encryption ␊ | 
| to add authentication and non-repudiation.␊ | 
| ␊ | 
| ␊ | 
| Single Password Symmetric Encryption Method:␊ | 
| -------------------------------------------␊ | 
| ␊ | 
| The Single Password Symmetric Encryption Method using strong ␊ | 
| encryption algorithms operates similarly to the traditional ␊ | 
| PKWARE encryption defined in this format.  Additional data ␊ | 
| structures are added to support the processing needs of the ␊ | 
| strong algorithms.␊ | 
| ␊ | 
| The Strong Encryption data structures are:␊ | 
| ␊ | 
| 1. General Purpose Bits - Bits 0 and 6 of the General Purpose bit ␊ | 
| flag in both local and central header records.  Both bits set ␊ | 
| indicates strong encryption.  Bit 13, when set, indicates the Central␊ | 
| Directory is encrypted and that selected fields in the Local Header␊ | 
| are masked to hide their actual value.␊ | 
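The bit tests described in item 1 can be sketched as follows. This is a minimal illustration only: the bit positions come from the text above, while the constant and function names are hypothetical and not part of this specification.

```python
# Bit positions are taken from item 1 above; all names here are illustrative.
GPBF_ENCRYPTED = 1 << 0           # bit 0
GPBF_STRONG_ENCRYPTION = 1 << 6   # bit 6
GPBF_CD_ENCRYPTED = 1 << 13       # bit 13


def uses_strong_encryption(flags: int) -> bool:
    """Return True only when both bit 0 and bit 6 are set."""
    required = GPBF_ENCRYPTED | GPBF_STRONG_ENCRYPTION
    return (flags & required) == required


def local_header_is_masked(flags: int) -> bool:
    """Return True when bit 13 signals an encrypted Central Directory."""
    return bool(flags & GPBF_CD_ENCRYPTED)
```

A reader would typically apply both checks to the general purpose bit flag of each header before trusting any Local Header values.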
| ␊ | 
| ␊ | 
| 2. Extra Field 0x0017 in central header only.␊ | 
| ␊ | 
| Fields to consider in this record are:␊ | 
| ␊ | 
| Format - the data format identifier for this record.  The only␊ | 
| value allowed at this time is the integer value 2.␊ | 
| ␊ | 
| AlgId - integer identifier of the encryption algorithm from the␊ | 
| following range␊ | 
| ␊ | 
| 0x6601 - DES␊ | 
| 0x6602 - RC2 (version needed to extract < 5.2)␊ | 
| 0x6603 - 3DES 168␊ | 
| 0x6609 - 3DES 112␊ | 
| 0x660E - AES 128 ␊ | 
| 0x660F - AES 192 ␊ | 
| 0x6610 - AES 256 ␊ | 
| 0x6702 - RC2 (version needed to extract >= 5.2)␊ | 
| 0x6720 - Blowfish␊ | 
| 0x6721 - Twofish␊ | 
| 0x6801 - RC4␊ | 
| 0xFFFF - Unknown algorithm␊ | 
| ␊ | 
| Bitlen - Explicit bit length of key␊ | 
| ␊ | 
| 32 - 448 bits␊ | 
| ␊ | 
| Flags - Processing flags needed for decryption␊ | 
| ␊ | 
| 0x0001 - Password is required to decrypt␊ | 
| 0x0002 - Certificates only␊ | 
| 0x0003 - Password or certificate required to decrypt␊ | 
| ␊ | 
| Values > 0x0003 reserved for certificate processing␊ | 
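As a rough sketch, the four fields listed in item 2 could be decoded as shown below. It assumes that the fields appear in the order Format, AlgId, Bitlen, Flags as consecutive 2-byte little-endian values, and that `payload` begins immediately after the Extra Field header ID and size. The function name and the returned dictionary keys are hypothetical.

```python
import struct

# Algorithm identifiers copied from the list above.
ALGORITHM_NAMES = {
    0x6601: "DES",
    0x6602: "RC2 (version needed to extract < 5.2)",
    0x6603: "3DES 168",
    0x6609: "3DES 112",
    0x660E: "AES 128",
    0x660F: "AES 192",
    0x6610: "AES 256",
    0x6702: "RC2 (version needed to extract >= 5.2)",
    0x6720: "Blowfish",
    0x6721: "Twofish",
    0x6801: "RC4",
    0xFFFF: "Unknown algorithm",
}


def parse_strong_encryption_fields(payload: bytes) -> dict:
    """Decode Format, AlgId, Bitlen and Flags from the start of a 0x0017 payload."""
    fmt, alg_id, bitlen, flags = struct.unpack_from("<HHHH", payload, 0)
    if fmt != 2:  # the only Format value allowed for this record
        raise ValueError(f"unsupported 0x0017 Format value: {fmt}")
    return {
        "alg_id": alg_id,
        "algorithm": ALGORITHM_NAMES.get(alg_id, f"unlisted (0x{alg_id:04X})"),
        "bitlen": bitlen,
        "flags": flags,
    }
```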
| ␊ | 
| ␊ | 
| 3. Decryption header record preceding compressed file data.␊ | 
| ␊ | 
| -Decryption Header:␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| IVSize    2 bytes  Size of initialization vector (IV)␊ | 
| IVData    IVSize   Initialization vector for this file␊ | 
| Size      4 bytes  Size of remaining decryption header data␊ | 
| Format    2 bytes  Format definition for this record␊ | 
| AlgID     2 bytes  Encryption algorithm identifier␊ | 
| Bitlen    2 bytes  Bit length of encryption key␊ | 
| Flags     2 bytes  Processing flags␊ | 
| ErdSize   2 bytes  Size of Encrypted Random Data␊ | 
| ErdData   ErdSize  Encrypted Random Data␊ | 
| Reserved1 4 bytes  Reserved certificate processing data␊ | 
| Reserved2 (var)    Reserved for certificate processing data␊ | 
| VSize     2 bytes  Size of password validation data␊ | 
| VData     VSize-4  Password validation data␊ | 
| VCRC32    4 bytes  Standard ZIP CRC32 of password validation data␊ | 
| ␊ | 
| IVData - The size of the IV should match the algorithm block size.␊ | 
| The IVData can be completely random data.  If the size of␊ | 
| the randomly generated data does not match the block size␊ | 
| it should be complemented with zeros or truncated as␊ | 
| necessary.  If IVSize is 0, then IV = CRC32 + Uncompressed␊ | 
| File Size (as a 64 bit little-endian, unsigned integer value).␊ | 
| ␊ | 
| Format - the data format identifier for this record.  The only␊ | 
| value allowed at this time is the integer value 3.␊ | 
| ␊ | 
| AlgId - integer identifier of the encryption algorithm from the␊ | 
| following range␊ | 
| ␊ | 
| 0x6601 - DES␊ | 
| 0x6602 - RC2 (version needed to extract < 5.2)␊ | 
| 0x6603 - 3DES 168␊ | 
| 0x6609 - 3DES 112␊ | 
| 0x660E - AES 128 ␊ | 
| 0x660F - AES 192 ␊ | 
| 0x6610 - AES 256 ␊ | 
| 0x6702 - RC2 (version needed to extract >= 5.2)␊ | 
| 0x6720 - Blowfish␊ | 
| 0x6721 - Twofish␊ | 
| 0x6801 - RC4␊ | 
| 0xFFFF - Unknown algorithm␊ | 
| ␊ | 
| Bitlen - Explicit bit length of key␊ | 
| ␊ | 
| 32 - 448 bits␊ | 
| ␊ | 
| Flags - Processing flags needed for decryption␊ | 
| ␊ | 
| 0x0001 - Password is required to decrypt␊ | 
| 0x0002 - Certificates only␊ | 
| 0x0003 - Password or certificate required to decrypt␊ | 
| ␊ | 
| Values > 0x0003 reserved for certificate processing␊ | 
| ␊ | 
| ErdData - Encrypted random data is used to store random data that␊ | 
| is used to generate a file session key for encrypting ␊ | 
| each file.  SHA1 is used to calculate hash data used to ␊ | 
| derive keys.  File session keys are derived from a master ␊ | 
| session key generated from the user-supplied password.␊ | 
| If the Flags field in the decryption header contains ␊ | 
| the value 0x4000, then the ErdData field must be ␊ | 
| decrypted using 3DES. If the value 0x4000 is not set,␊ | 
| then the ErdData field must be decrypted using AlgId.␊ | 
| ␊ | 
| ␊ | 
| Reserved1 - Reserved for certificate processing, if value is␊ | 
| zero, then Reserved2 data is absent.  See the explanation␊ | 
| under the Certificate Processing Method for details on␊ | 
| this data structure.␊ | 
| ␊ | 
| Reserved2 - If present, the size of the Reserved2 data structure ␊ | 
| is located by skipping the first 4 bytes of this field ␊ | 
| and using the next 2 bytes as the remaining size.  See␊ | 
| the explanation under the Certificate Processing Method␊ | 
| for details on this data structure.␊ | 
| ␊ | 
| VSize - This size value will always include the 4 bytes of the␊ | 
| VCRC32 data and will be greater than 4 bytes.␊ | 
| ␊ | 
| VData - Random data for password validation.  This data is VSize␊ | 
| in length and VSize must be a multiple of the encryption␊ | 
| block size.  VCRC32 is a checksum value of VData.  ␊ | 
| VData and VCRC32 are stored encrypted and start the␊ | 
| stream of encrypted data for a file.␊ | 
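The Decryption Header table can be walked field by field. The sketch below is one possible reading, not a reference implementation. It assumes little-endian integers, handles only the password case where Reserved1 is zero (so Reserved2 is absent), and returns VData and VCRC32 together as one still-encrypted block of VSize bytes. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_decryption_header(buf: bytes) -> dict:
    """Split a Decryption Header into its fields (password-only case)."""
    pos = 0
    (iv_size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    iv = buf[pos:pos + iv_size]
    pos += iv_size
    (size,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    fmt, alg_id, bitlen, flags, erd_size = struct.unpack_from("<HHHHH", buf, pos)
    pos += 10
    if fmt != 3:  # the only Format value allowed for this record
        raise ValueError(f"unsupported decryption header Format: {fmt}")
    erd_data = buf[pos:pos + erd_size]
    pos += erd_size
    (reserved1,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    if reserved1 != 0:
        # Reserved2 follows in this case; certificate data is not handled here.
        raise NotImplementedError("certificate recipient data present")
    (vsize,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    if vsize <= 4:
        raise ValueError("VSize must be greater than 4")
    validation = buf[pos:pos + vsize]  # encrypted VData followed by VCRC32
    pos += vsize
    return {
        "iv": iv, "size": size, "alg_id": alg_id, "bitlen": bitlen,
        "flags": flags, "erd_data": erd_data,
        "encrypted_validation": validation, "consumed": pos,
    }
```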
| ␊ | 
| ␊ | 
| 4. Useful Tips␊ | 
| ␊ | 
| Strong Encryption is always applied to a file after compression. The␊ | 
| block oriented algorithms all operate in Cipher Block Chaining (CBC) ␊ | 
| mode.  The block size used for AES and Twofish encryption is 16 bytes.␊ | 
| All other block algorithms use a block size of 8 bytes.  Two IDs are␊ | 
| defined for RC2 to ␊ | 
| account for a discrepancy found in the implementation of the RC2␊ | 
| algorithm in the cryptographic library on Windows XP SP1 and all ␊ | 
| earlier versions of Windows.  It is recommended that zero length files␊ | 
| not be encrypted, however programs should be prepared to extract them␊ | 
| if they are found within a ZIP file.␊ | 
| ␊ | 
| A pseudo-code representation of the encryption process is as follows:␊ | 
| ␊ | 
| Password = GetUserPassword()␊ | 
| MasterSessionKey = DeriveKey(SHA1(Password)) ␊ | 
| RD = CryptographicStrengthRandomData() ␊ | 
| For Each File␊ | 
|     IV = CryptographicStrengthRandomData()␊ | 
|     VData = CryptographicStrengthRandomData()␊ | 
|     VCRC32 = CRC32(VData)␊ | 
|     FileSessionKey = DeriveKey(SHA1(IV + RD))␊ | 
|     ErdData = Encrypt(RD, MasterSessionKey, IV)␊ | 
|     Encrypt(VData + VCRC32 + FileData, FileSessionKey, IV)␊ | 
| Done␊ | 
| ␊ | 
| The function names and parameter requirements will depend on␊ | 
| the choice of the cryptographic toolkit selected.  Almost any␊ | 
| toolkit supporting the reference implementations for each␊ | 
| algorithm can be used.  The RSA BSAFE(r), OpenSSL, and Microsoft␊ | 
| CryptoAPI libraries are all known to work well.  ␊ | 
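The password validation step (VData followed by VCRC32) needs no cryptographic toolkit and can be sketched with the Python standard library alone. The block below builds a validation block before encryption and checks one after decryption. The 12-byte default length (16 bytes in total, one AES block) and the little-endian encoding of VCRC32 are assumptions, and the function names are hypothetical. Key derivation and the cipher itself are deliberately left out.

```python
import os
import struct
import zlib


def make_validation_block(vdata_len: int = 12) -> bytes:
    """Return random VData followed by its CRC32 (plaintext, before encryption)."""
    vdata = os.urandom(vdata_len)
    vcrc32 = zlib.crc32(vdata) & 0xFFFFFFFF
    return vdata + struct.pack("<I", vcrc32)


def validation_block_matches(decrypted: bytes) -> bool:
    """Check a decrypted VData + VCRC32 block; False suggests a wrong password."""
    if len(decrypted) <= 4:
        return False
    vdata, stored = decrypted[:-4], decrypted[-4:]
    return struct.pack("<I", zlib.crc32(vdata) & 0xFFFFFFFF) == stored
```

A decryptor would run the second function on the first VSize bytes of decrypted output before attempting to decompress any file data.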
| ␊ | 
| ␊ | 
| Single Password - Central Directory Encryption:␊ | 
| -----------------------------------------------␊ | 
| ␊ | 
| Central Directory Encryption is achieved within the .ZIP format by ␊ | 
| encrypting the Central Directory structure.  This encapsulates the metadata ␊ | 
| most often used for processing .ZIP files.  Additional metadata is stored for ␊ | 
| redundancy in the Local Header for each file.  The process of concealing ␊ | 
| metadata by encrypting the Central Directory does not protect the data within ␊ | 
| the Local Header.  To avoid information leakage from the exposed metadata ␊ | 
| in the Local Header, the fields containing information about a file are masked.  ␊ | 
| ␊ | 
| Local Header:␊ | 
| ␊ | 
| Masking replaces the true content of the fields for a file in the Local ␊ | 
| Header with false information.  When masked, the Local Header is not ␊ | 
| suitable for streaming access and the options for data recovery of damaged␊ | 
| archives are reduced.  Extra Data fields that may contain confidential␊ | 
| data should not be stored within the Local Header.  The value set into␊ | 
| the Version needed to extract field should be the correct value needed to␊ | 
| extract the file without regard to Central Directory Encryption. The fields ␊ | 
| within the Local Header targeted for masking when the Central Directory is ␊ | 
| encrypted are:␊ | 
| ␊ | 
| Field Name                     Mask Value␊ | 
| ------------------             ---------------------------␊ | 
| compression method              0␊ | 
| last mod file time              0␊ | 
| last mod file date              0␊ | 
| crc-32                          0␊ | 
| compressed size                 0␊ | 
| uncompressed size               0␊ | 
| file name (variable size)       Base 16 value from the␊ | 
|                                 range 1 - 0xFFFFFFFFFFFFFFFF␊ | 
|                                 represented as a string whose␊ | 
|                                 size will be set into the␊ | 
|                                 file name length field␊ | 
| ␊ | 
| The Base 16 value assigned as a masked file name is simply a sequentially␊ | 
| incremented value for each file starting with 1 for the first file.  ␊ | 
| Modifications to a ZIP file may cause different values to be stored for ␊ | 
| each file.  For compatibility, the file name field in the Local Header ␊ | 
| should never be left blank.  As of Version 6.2 of this specification, ␊ | 
| the Compression Method and Compressed Size fields are not yet masked.␊ | 
| Fields having a value of 0xFFFF or 0xFFFFFFFF for the ZIP64 format␊ | 
| should not be masked.  ␊ | 
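The masking rules above might be applied as in the following sketch. It zeroes the time, date, CRC and uncompressed size fields, leaves Compression Method and Compressed Size untouched (per the Version 6.2 note), preserves a ZIP64 marker in the size field, and substitutes a sequential base 16 file name. The dictionary keys, the function name, and the choice of uppercase hex digits without padding are assumptions made for illustration.

```python
ZIP64_MARKERS = (0xFFFF, 0xFFFFFFFF)


def mask_local_header(fields: dict, index: int) -> dict:
    """Return a masked copy of Local Header fields for the index-th file (1-based)."""
    if index < 1:
        raise ValueError("masked file names start at 1")
    masked = dict(fields)
    for name in ("last_mod_time", "last_mod_date", "crc32"):
        masked[name] = 0
    # A ZIP64 marker value must stay visible so readers can find the real size.
    if masked["uncompressed_size"] not in ZIP64_MARKERS:
        masked["uncompressed_size"] = 0
    # Sequential base 16 value, stored as a string; never left blank.
    masked["file_name"] = format(index, "X")
    return masked
```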
| ␊ | 
| Encrypting the Central Directory:␊ | 
| ␊ | 
| Encryption of the Central Directory does not include encryption of the ␊ | 
| Central Directory Signature data, the Zip64 End of Central Directory␊ | 
| record, the Zip64 End of Central Directory Locator, or the End␊ | 
| of Central Directory record.  The ZIP file comment data is never␊ | 
| encrypted.␊ | 
| ␊ | 
| Before encrypting the Central Directory, it may optionally be compressed.␊ | 
| Compression is not required, but for storage efficiency it is assumed␊ | 
| this structure will be compressed before encrypting.  Similarly, this ␊ | 
| specification supports compressing the Central Directory without␊ | 
| requiring that it also be encrypted.  Early implementations of this␊ | 
| feature will assume the encryption method applied to files matches the ␊ | 
| encryption applied to the Central Directory.␊ | 
| ␊ | 
| Encryption of the Central Directory is done in a manner similar to␊ | 
| that of file encryption.  The encrypted data is preceded by a ␊ | 
| decryption header.  The decryption header is known as the Archive␊ | 
| Decryption Header.  The fields of this record are identical to␊ | 
| the decryption header preceding each encrypted file.  The location␊ | 
| of the Archive Decryption Header is determined by the value in the␊ | 
| Start of the Central Directory field in the Zip64 End of Central␊ | 
| Directory record.  When the Central Directory is encrypted, the␊ | 
| Zip64 End of Central Directory record will always be present.␊ | 
| ␊ | 
| The layout of the Zip64 End of Central Directory record for all␊ | 
| versions starting with 6.2 of this specification will follow the␊ | 
| Version 2 format.  The Version 2 format is as follows:␊ | 
| ␊ | 
| The leading fixed size fields within the Version 1 format for this␊ | 
| record remain unchanged.  The record signature for both Version 1 ␊ | 
| and Version 2 will be 0x06064b50.  Immediately following the last␊ | 
| byte of the field known as the Offset of Start of Central ␊ | 
| Directory With Respect to the Starting Disk Number will begin the ␊ | 
| new fields defining Version 2 of this record.  ␊ | 
| ␊ | 
| New fields for Version 2:␊ | 
| ␊ | 
| Note: all fields stored in Intel low-byte/high-byte order.␊ | 
| ␊ | 
| Value                 Size       Description␊ | 
| -----                 ----       -----------␊ | 
| Compression Method    2 bytes    Method used to compress the␊ | 
| Central Directory␊ | 
| Compressed Size       8 bytes    Size of the compressed data␊ | 
| Original   Size       8 bytes    Original uncompressed size␊ | 
| AlgId                 2 bytes    Encryption algorithm ID␊ | 
| BitLen                2 bytes    Encryption key length␊ | 
| Flags                 2 bytes    Encryption flags␊ | 
| HashID                2 bytes    Hash algorithm identifier␊ | 
| Hash Length           2 bytes    Length of hash data␊ | 
| Hash Data             (variable) Hash data␊ | 
| ␊ | 
| The Compression Method accepts the same range of values as the ␊ | 
| corresponding field in the Central Header.␊ | 
| ␊ | 
| The Compressed Size and Original Size values will not include the␊ | 
| data of the Central Directory Signature which is compressed or␊ | 
| encrypted.␊ | 
| ␊ | 
| The AlgId, BitLen, and Flags fields accept the same range of values␊ | 
| as the corresponding fields within the 0x0017 record.␊ | 
| ␊ | 
| Hash ID identifies the algorithm used to hash the Central Directory ␊ | 
| data.  This data does not have to be hashed, in which case the␊ | 
| values for both the HashID and Hash Length will be 0.  Possible ␊ | 
| values for HashID are:␊ | 
| ␊ | 
| Value           Algorithm␊ | 
| ------          ---------␊ | 
| 0x0000          none␊ | 
| 0x0001          CRC32␊ | 
| 0x8003          MD5␊ | 
| 0x8004          SHA1␊ | 
| 0x8007          RIPEMD160␊ | 
| 0x800C          SHA256␊ | 
| 0x800D          SHA384␊ | 
| 0x800E          SHA512␊ | 
| ␊ | 
| When the Central Directory data is signed, the same hash algorithm␊ | 
| used to hash the Central Directory for signing should be used.␊ | 
| This is recommended for processing efficiency, however, it is ␊ | 
| permissible for any of the above algorithms to be used independent ␊ | 
| of the signing process.␊ | 
| ␊ | 
| The Hash Data will contain the hash data for the Central Directory.␊ | 
| The length of this data will vary depending on the algorithm used.␊ | 
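The Version 2 fields can be unpacked in one step once the position just past the Version 1 fixed fields is known. The sketch below assumes `tail` starts at the Compression Method field and uses the little-endian layout from the table above. The function name and dictionary keys are hypothetical.

```python
import struct

# Hash algorithm identifiers copied from the HashID table above.
HASH_ALGORITHMS = {
    0x0000: "none", 0x0001: "CRC32", 0x8003: "MD5", 0x8004: "SHA1",
    0x8007: "RIPEMD160", 0x800C: "SHA256", 0x800D: "SHA384", 0x800E: "SHA512",
}

# Compression Method, Compressed Size, Original Size, AlgId, BitLen,
# Flags, HashID, Hash Length: 28 bytes in total, little-endian.
V2_FIXED = struct.Struct("<HQQHHHHH")


def parse_zip64_eocd_v2_fields(tail: bytes) -> dict:
    """Decode the Version 2 fields that follow the Version 1 fixed fields."""
    (method, compressed_size, original_size, alg_id, bitlen,
     flags, hash_id, hash_len) = V2_FIXED.unpack_from(tail, 0)
    hash_data = tail[V2_FIXED.size:V2_FIXED.size + hash_len]
    if len(hash_data) != hash_len:
        raise ValueError("record is shorter than Hash Length indicates")
    return {
        "compression_method": method,
        "compressed_size": compressed_size,
        "original_size": original_size,
        "alg_id": alg_id,
        "bitlen": bitlen,
        "flags": flags,
        "hash_algorithm": HASH_ALGORITHMS.get(hash_id, f"unlisted (0x{hash_id:04X})"),
        "hash_data": hash_data,
    }
```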
| ␊ | 
| The Version Needed to Extract should be set to 62.␊ | 
| ␊ | 
| The value for the Total Number of Entries on the Current Disk will␊ | 
| be 0.  These records will no longer support random access when␊ | 
| encrypting the Central Directory.␊ | 
| ␊ | 
| When the Central Directory is compressed and/or encrypted, the␊ | 
| End of Central Directory record will store the value 0xFFFFFFFF␊ | 
| as the value for the Total Number of Entries in the Central␊ | 
| Directory.  The value stored in the Total Number of Entries in␊ | 
| the Central Directory on this Disk field will be 0.  The actual␊ | 
| values will be stored in the equivalent fields of the Zip64␊ | 
| End of Central Directory record.␊ | 
| ␊ | 
| Decrypting and decompressing the Central Directory is accomplished␊ | 
| in the same manner as decrypting and decompressing a file.␊ | 
| ␊ | 
| Certificate Processing Method:␊ | 
| -----------------------------␊ | 
| ␊ | 
| The Certificate Processing Method for ZIP file encryption ␊ | 
| defines the following additional data fields:␊ | 
| ␊ | 
| 1. Certificate Flag Values␊ | 
| ␊ | 
| Additional processing flags that can be present in the Flags field of both ␊ | 
| the 0x0017 field of the central directory Extra Field and the Decryption ␊ | 
| header record preceding compressed file data are:␊ | 
| ␊ | 
| 0x0007 - reserved for future use␊ | 
| 0x000F - reserved for future use␊ | 
| 0x0100 - Indicates non-OAEP key wrapping was used.  If␊ | 
| this flag is set, the version needed to extract must␊ | 
| be at least 61.  This means OAEP key wrapping is not␊ | 
| used when generating a Master Session Key using␊ | 
| ErdData.␊ | 
| 0x4000 - ErdData must be decrypted using 3DES-168, otherwise use the␊ | 
| same algorithm used for encrypting the file contents.␊ | 
| 0x8000 - reserved for future use␊ | 
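A small decoder for the Flags field might look like the sketch below. It treats 0x0003 as the combination of the 0x0001 and 0x0002 bits, which is an interpretation rather than something the text states outright, and it picks the ErdData algorithm according to the 0x4000 flag. All names are hypothetical.

```python
ALG_3DES_168 = 0x6603  # AlgId for 3DES 168, from the algorithm list


def decode_flags(flags: int) -> dict:
    """Break the Flags field into the individual indicators described above."""
    return {
        "password": bool(flags & 0x0001),
        "certificate": bool(flags & 0x0002),
        "non_oaep_key_wrapping": bool(flags & 0x0100),
        "erd_uses_3des": bool(flags & 0x4000),
    }


def erd_algorithm(flags: int, alg_id: int) -> int:
    """Return the AlgId to use when decrypting ErdData."""
    return ALG_3DES_168 if flags & 0x4000 else alg_id
```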
| ␊ | 
| ␊ | 
| 2. CertData - Extra Field 0x0017 record certificate data structure␊ | 
| ␊ | 
| The data structure used to store certificate data within the section␊ | 
| of the Extra Field defined by the CertData field of the 0x0017␊ | 
| record is as shown:␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| RCount    4 bytes  Number of recipients.  ␊ | 
| HashAlg   2 bytes  Hash algorithm identifier␊ | 
| HSize     2 bytes  Hash size␊ | 
| SRList    (var)    Simple list of recipients' hashed public keys␊ | 
| ␊ | 
| ␊ | 
| RCount    This defines the number of intended recipients whose ␊ | 
| public keys were used for encryption.  This identifies␊ | 
| the number of elements in the SRList.␊ | 
| ␊ | 
| HashAlg   This defines the hash algorithm used to calculate␊ | 
| the public key hash of each public key used␊ | 
| for encryption. This field currently supports␊ | 
| only the following value for SHA-1␊ | 
| ␊ | 
| 0x8004 - SHA1␊ | 
| ␊ | 
| HSize     This defines the size of a hashed public key.␊ | 
| ␊ | 
| SRList    This is a variable length list of the hashed ␊ | 
| public keys for each intended recipient.  Each ␊ | 
| element in this list is HSize.  The total size of ␊ | 
| SRList is determined using RCount * HSize.␊ | 
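The CertData layout lends itself to a short parser. The sketch below assumes little-endian integers and that `data` begins at RCount. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_cert_data(data: bytes) -> dict:
    """Split CertData into RCount, HashAlg, HSize and the SRList entries."""
    rcount, hash_alg, hsize = struct.unpack_from("<IHH", data, 0)
    start = 8  # 4 + 2 + 2 bytes of fixed fields
    if len(data) < start + rcount * hsize:
        raise ValueError("CertData is shorter than RCount * HSize requires")
    sr_list = [data[start + i * hsize:start + (i + 1) * hsize]
               for i in range(rcount)]
    return {"rcount": rcount, "hash_alg": hash_alg, "hsize": hsize,
            "sr_list": sr_list}
```

With HashAlg 0x8004 (SHA1), each SRList entry would be 20 bytes long.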
| ␊ | 
| ␊ | 
| 3. Reserved1 - Certificate Decryption Header Reserved1 Data:␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| RCount    4 bytes  Number of recipients.  ␊ | 
| ␊ | 
| RCount    This defines the number of intended recipients whose ␊ | 
| public keys were used for encryption.  This defines␊ | 
| the number of elements in the REList field defined below.␊ | 
| ␊ | 
| ␊ | 
| 4. Reserved2 - Certificate Decryption Header Reserved2 Data Structures:␊ | 
| ␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| HashAlg   2 bytes  Hash algorithm identifier␊ | 
| HSize     2 bytes  Hash size␊ | 
| REList    (var)    List of recipient data elements␊ | 
| ␊ | 
| ␊ | 
| HashAlg   This defines the hash algorithm used to calculate␊ | 
| the public key hash of each public key used␊ | 
| for encryption. This field currently supports␊ | 
| only the following value for SHA-1␊ | 
| ␊ | 
| 0x8004 - SHA1␊ | 
| ␊ | 
| HSize     This defines the size of a hashed public key␊ | 
| defined in REHData.␊ | 
| ␊ | 
| REList    This is a variable length list of recipient data.  ␊ | 
| Each element in this list consists of a Recipient␊ | 
| Element data structure as follows:␊ | 
| ␊ | 
| ␊ | 
| Recipient Element (REList) Data Structure:␊ | 
| ␊ | 
| Value     Size     Description␊ | 
| -----     ----     -----------␊ | 
| RESize    2 bytes  Size of REHData + REKData␊ | 
| REHData   HSize    Hash of recipient's public key␊ | 
| REKData   (var)    Simple key blob␊ | 
| ␊ | 
| ␊ | 
| RESize    This defines the size of an individual REList ␊ | 
| element.  This value is the combined size of the␊ | 
| REHData field + REKData field.  REHData is defined by␊ | 
| HSize.  REKData is variable and can be calculated␊ | 
| for each REList element using RESize and HSize.␊ | 
| ␊ | 
| REHData   Hashed public key for this recipient.␊ | 
| ␊ | 
| REKData   Simple Key Blob.  The format of this data structure␊ | 
| is identical to that defined in the Microsoft␊ | 
| CryptoAPI and generated using the CryptExportKey()␊ | 
| function.  The version of the Simple Key Blob␊ | 
| supported at this time is 0x02 as defined by␊ | 
| Microsoft.␊ | 
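Walking the Reserved2 structure could be sketched as follows. The code follows the tables in item 4: HashAlg and HSize, then one Recipient Element per recipient, with RCount supplied from Reserved1. Little-endian integers are assumed, the Simple Key Blob is returned unparsed, and the function name is hypothetical. Any additional size prefix implied elsewhere in this document is not modeled here.

```python
import struct


def parse_reserved2(data: bytes, rcount: int) -> dict:
    """Split Reserved2 into HashAlg, HSize and (REHData, REKData) pairs."""
    hash_alg, hsize = struct.unpack_from("<HH", data, 0)
    pos = 4
    recipients = []
    for _ in range(rcount):
        (resize,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if resize < hsize or pos + resize > len(data):
            raise ValueError("malformed Recipient Element")
        reh_data = data[pos:pos + hsize]            # hashed public key
        rek_data = data[pos + hsize:pos + resize]   # Simple Key Blob (opaque here)
        recipients.append((reh_data, rek_data))
        pos += resize
    return {"hash_alg": hash_alg, "hsize": hsize, "recipients": recipients}
```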
| ␊ | 
| Certificate Processing - Central Directory Encryption:␊ | 
| ------------------------------------------------------␊ | 
| ␊ | 
| Central Directory Encryption using Digital Certificates will ␊ | 
| operate in a manner similar to that of Single Password Central␊ | 
| Directory Encryption.  The Archive Extra Data record will only be␊ | 
| present when there is data to place into it.  Currently, data is placed␊ | 
| into this record when digital certificates are used for either encrypting ␊ | 
| or signing the files within a ZIP file.  When only password ␊ | 
| encryption is used with no certificate encryption or digital ␊ | 
| signing, this record is not currently needed. When present, this ␊ | 
| record will appear before the start of the actual Central Directory ␊ | 
| data structure and will be located immediately after the Archive ␊ | 
| Decryption Header if the Central Directory is encrypted.␊ | 
| ␊ | 
| The Archive Extra Data record will be used to store the following␊ | 
| information.  Additional data may be added in future versions.␊ | 
| ␊ | 
| Extra Data Fields:␊ | 
| ␊ | 
| 0x0014 - PKCS#7 Store for X.509 Certificates␊ | 
| 0x0016 - X.509 Certificate ID and Signature for central directory␊ | 
| 0x0019 - PKCS#7 Encryption Recipient Certificate List␊ | 
| ␊ | 
| The 0x0014 and 0x0016 Extra Data records would otherwise be ␊ | 
| located in the first record of the Central Directory for digital ␊ | 
| certificate processing.  When encrypting or compressing the Central ␊ | 
| Directory, the 0x0014 and 0x0016 records must be located in the ␊ | 
| Archive Extra Data record and they should not remain in the first ␊ | 
| Central Directory record.  The Archive Extra Data record will also ␊ | 
| be used to store the 0x0019 data. ␊ | 
| ␊ | 
| When present, the size of the Archive Extra Data record will be␊ | 
| included in the size of the Central Directory.  The data of the␊ | 
| Archive Extra Data record will also be compressed and encrypted␊ | 
| along with the Central Directory data structure.␊ | 
| ␊ | 
| Certificate Processing Differences:␊ | 
| ␊ | 
| The Certificate Processing Method of encryption differs from the␊ | 
| Single Password Symmetric Encryption Method as follows.  Instead␊ | 
| of using a user-defined password to generate a master session key,␊ | 
| cryptographically random data is used.  The key material is then␊ | 
| wrapped using standard key-wrapping techniques.  This key material␊ | 
| is wrapped using the public key of each recipient that will need␊ | 
| to decrypt the file using their corresponding private key.␊ | 
| ␊ | 
| This specification currently assumes digital certificates will follow␊ | 
| the X.509 V3 format for 1024 bit and higher RSA format digital␊ | 
| certificates.  Implementation of this Certificate Processing Method␊ | 
| requires supporting logic for key access and management.  This logic␊ | 
| is outside the scope of this specification.␊ | 
| ␊ | 
| OAEP Processing with Certificate-based Encryption:␊ | 
| ␊ | 
| OAEP stands for Optimal Asymmetric Encryption Padding.  It is a␊ | 
| strengthening technique used for small encoded items such as decryption␊ | 
| keys.  This is commonly applied in cryptographic key-wrapping techniques␊ | 
| and is supported by PKCS #1.  Versions 5.0 and 6.0 of this specification ␊ | 
| were designed to support OAEP key-wrapping for certificate-based ␊ | 
| decryption keys for additional security.  ␊ | 
| ␊ | 
| Support for private keys stored on Smartcards or Tokens introduced␊ | 
| a conflict with this OAEP logic.  Most card and token products do ␊ | 
| not support the additional strengthening applied to OAEP key-wrapped ␊ | 
| data.  In order to resolve this conflict, versions 6.1 and above of this ␊ | 
| specification will no longer support OAEP when encrypting using ␊ | 
| digital certificates. ␊ | 
| ␊ | 
| Versions of PKZIP available during initial development of the ␊ | 
| certificate processing method set a value of 61 into the ␊ | 
| version needed to extract field for a file.  This indicates that ␊ | 
| non-OAEP key wrapping is used.  This affects certificate encryption ␊ | 
| only, and password encryption functions should not be affected by ␊ | 
| this value.  This means values of 61 may be found on files encrypted␊ | 
| with certificates only, or on files encrypted with both password␊ | 
| encryption and certificate encryption.  Files encrypted with both␊ | 
| methods can safely be decrypted using the password methods documented.␊ | 
| ␊ | 
| XVII. Change Process␊ | 
| --------------------␊ | 
| ␊ | 
| In order for the .ZIP file format to remain a viable definition, this␊ | 
| specification should be considered as open for periodic review and␊ | 
| revision.  Although this format was originally designed with a ␊ | 
| certain level of extensibility, not all changes in technology␊ | 
| (present or future) were or will be necessarily considered in its␊ | 
| design.  If your application requires new definitions to the␊ | 
| extensible sections in this format, or if you would like to ␊ | 
| submit new data structures, please forward your request to␊ | 
| zipformat@pkware.com.  All submissions will be reviewed by the␊ | 
| ZIP File Specification Committee for possible inclusion into␊ | 
| future versions of this specification.  Periodic revisions␊ | 
| to this specification will be published to ensure interoperability. ␊ | 
| We encourage comments and feedback that may help improve clarity ␊ | 
| or content.␊ | 
| ␊ | 
| XVIII. Incorporating PKWARE Proprietary Technology into Your Product␊ | 
| --------------------------------------------------------------------␊ | 
| ␊ | 
| PKWARE is committed to the interoperability and advancement of the␊ | 
| .ZIP format.  PKWARE offers a free license for certain technological␊ | 
| aspects described above under certain restrictions and conditions.␊ | 
| However, the use or implementation in a product of certain technological␊ | 
| aspects set forth in the current APPNOTE, including those with regard to␊ | 
| strong encryption, patching, or extended tape operations requires a ␊ | 
| license from PKWARE.  Please contact PKWARE with regard to acquiring␊ | 
| a license.␊ | 
| ␊ | 
| XIX. Acknowledgements␊ | 
| ----------------------␊ | 
| ␊ | 
| In addition to the above mentioned contributors to PKZIP and PKUNZIP,␊ | 
| I would like to extend special thanks to Robert Mahoney for suggesting␊ | 
| the extension .ZIP for this software.␊ | 
| ␊ | 
| XX. References␊ | 
| --------------␊ | 
| ␊ | 
| Fiala, Edward R., and Greene, Daniel H., "Data compression with␊ | 
| finite windows",  Communications of the ACM, Volume 32, Number 4,␊ | 
| April 1989, pages 490-505.␊ | 
| ␊ | 
| Held, Gilbert, "Data Compression, Techniques and Applications,␊ | 
| Hardware and Software Considerations", John Wiley & Sons, 1987.␊ | 
| ␊ | 
| Huffman, D.A., "A method for the construction of minimum-redundancy␊ | 
| codes", Proceedings of the IRE, Volume 40, Number 9, September 1952,␊ | 
| pages 1098-1101.␊ | 
| ␊ | 
| Nelson, Mark, "LZW Data Compression", Dr. Dobb's Journal, Volume 14,␊ | 
| Number 10, October 1989, pages 29-37.␊ | 
| ␊ | 
| Nelson, Mark, "The Data Compression Book",  M&T Books, 1991.␊ | 
| ␊ | 
| Storer, James A., "Data Compression, Methods and Theory",␊ | 
| Computer Science Press, 1988.␊ | 
| ␊ | 
| Welch, Terry, "A Technique for High-Performance Data Compression",␊ | 
| IEEE Computer, Volume 17, Number 6, June 1984, pages 8-19.␊ | 
| ␊ | 
| Ziv, J. and Lempel, A., "A universal algorithm for sequential data␊ | 
| compression", IEEE Transactions on Information Theory, Volume 23,␊ | 
| Number 3, May 1977, pages 337-343.␊ | 
| ␊ | 
| Ziv, J. and Lempel, A., "Compression of individual sequences via␊ | 
| variable-rate coding", IEEE Transactions on Information Theory,␊ | 
| Volume 24, Number 5, September 1978, pages 530-536.␊ | 
| ␊ | 
| ␊ | 
| APPENDIX A - AS/400 Extra Field (0x0065) Attribute Definitions␊ | 
| --------------------------------------------------------------␊ | 
| ␊ | 
| Field Definition Structure:␊ | 
| ␊ | 
| a. field length including length             2 bytes␊ | 
| b. field code                                2 bytes␊ | 
| c. data                                      x bytes␊ | 
| ␊ | 
| Field Code  Description␊ | 
| 4001     Source type i.e. CLP etc␊ | 
| 4002     The text description of the library ␊ | 
| 4003     The text description of the file␊ | 
| 4004     The text description of the member␊ | 
| 4005     x'F0' or 0 is PF-DTA,  x'F1' or 1 is PF_SRC␊ | 
| 4007     Database Type Code                  1 byte␊ | 
| 4008     Database file and fields definition␊ | 
| 4009     GZIP file type                      2 bytes␊ | 
| 400B     IFS code page                       2 bytes␊ | 
| 400C     IFS Creation Time                   4 bytes␊ | 
| 400D     IFS Access Time                     4 bytes␊ | 
| 400E     IFS Modification time               4 bytes␊ | 
| 005C     Length of the records in the file   2 bytes␊ | 
| 0068     GZIP two words                      8 bytes␊ | 
| ␊ | 
| APPENDIX B - z/OS Extra Field (0x0065) Attribute Definitions␊ | 
| ------------------------------------------------------------␊ | 
| ␊ | 
| Field Definition Structure:␊ | 
| ␊ | 
| a. field length including length             2 bytes␊ | 
| b. field code                                2 bytes␊ | 
| c. data                                      x bytes␊ | 
| ␊ | 
| Field Code  Description␊ | 
| 0001     File Type                           2 bytes ␊ | 
| 0002     NonVSAM Record Format               1 byte␊ | 
| 0003     Reserved␉␉␊ | 
| 0004     NonVSAM Block Size                  2 bytes Big Endian␊ | 
| 0005     Primary Space Allocation            3 bytes Big Endian␊ | 
| 0006     Secondary Space Allocation          3 bytes Big Endian␊ | 
| 0007     Space Allocation Type               1 byte flag␉␉␊ | 
| 0008     Modification Date                   Retired with PKZIP 5.0 +␊ | 
| 0009     Expiration Date                     Retired with PKZIP 5.0 +␊ | 
| 000A     PDS Directory Block Allocation      3 bytes Big Endian binary value␊ | 
| 000B     NonVSAM Volume List                 variable␉␉␊ | 
| 000C     UNIT Reference                      Retired with PKZIP 5.0 +␊ | 
| 000D     DF/SMS Management Class             8 bytes EBCDIC Text Value␊ | 
| 000E     DF/SMS Storage Class                8 bytes EBCDIC Text Value␊ | 
| 000F     DF/SMS Data Class                   8 bytes EBCDIC Text Value␊ | 
| 0010     PDS/PDSE Member Info.               30 bytes␉␊ | 
| 0011     VSAM sub-filetype                   2 bytes␉␉␊ | 
| 0012     VSAM LRECL                          13 bytes EBCDIC "(num_avg num_max)"␊ | 
| 0013     VSAM Cluster Name                   Retired with PKZIP 5.0 +␊ | 
| 0014     VSAM KSDS Key Information           13 bytes EBCDIC "(num_length num_position)"␊ | 
| 0015     VSAM Average LRECL                  5 bytes EBCDIC num_value padded with blanks␊ | 
| 0016     VSAM Maximum LRECL                  5 bytes EBCDIC num_value padded with blanks␊ | 
| 0017     VSAM KSDS Key Length                5 bytes EBCDIC num_value padded with blanks␊ | 
| 0018     VSAM KSDS Key Position              5 bytes EBCDIC num_value padded with blanks␊ | 
| 0019     VSAM Data Name                      1-44 bytes EBCDIC text string␊ | 
| 001A     VSAM KSDS Index Name                1-44 bytes EBCDIC text string␊ | 
| 001B     VSAM Catalog Name                   1-44 bytes EBCDIC text string␊ | 
| 001C     VSAM Data Space Type                9 bytes EBCDIC text string␊ | 
| 001D     VSAM Data Space Primary             9 bytes EBCDIC num_value left-justified␊ | 
| 001E     VSAM Data Space Secondary           9 bytes EBCDIC num_value left-justified␊ | 
| 001F     VSAM Data Volume List               variable EBCDIC text list of 6-character Volume IDs␊ | 
| 0020     VSAM Data Buffer Space              8 bytes EBCDIC num_value left-justified␊ | 
| 0021     VSAM Data CISIZE                    5 bytes EBCDIC num_value left-justified␊ | 
| 0022     VSAM Erase Flag                     1 byte flag␉␉␊ | 
| 0023     VSAM Free CI %                      3 bytes EBCDIC num_value left-justified␊ | 
| 0024     VSAM Free CA %                      3 bytes EBCDIC num_value left-justified␊ | 
| 0025     VSAM Index Volume List              variable EBCDIC text list of 6-character Volume IDs␊ | 
| 0026     VSAM Ordered Flag                   1 byte flag␉␉␊ | 
| 0027     VSAM REUSE Flag                     1 byte flag␉␉␊ | 
| 0028     VSAM SPANNED Flag                   1 byte flag␉␉␊ | 
| 0029     VSAM Recovery Flag                  1 byte flag␉␉␊ | 
| 002A     VSAM WRITECHK Flag                  1 byte flag␊ | 
| 002B     VSAM Cluster/Data SHROPTS           3 bytes EBCDIC "n,y"␉␊ | 
| 002C     VSAM Index SHROPTS                  3 bytes EBCDIC "n,y"␉␊ | 
| 002D     VSAM Index Space Type               9 bytes EBCDIC text string␊ | 
| 002E     VSAM Index Space Primary            9 bytes EBCDIC num_value left-justified␊ | 
| 002F     VSAM Index Space Secondary          9 bytes EBCDIC num_value left-justified␊ | 
| 0030     VSAM Index CISIZE                   5 bytes EBCDIC num_value left-justified␊ | 
| 0031     VSAM Index IMBED                    1 byte flag␉␉␊ | 
| 0032     VSAM Index Ordered Flag             1 byte flag␉␉␊ | 
| 0033     VSAM REPLICATE Flag                 1 byte flag␉␉␊ | 
| 0034     VSAM Index REUSE Flag               1 byte flag␉␉␊ | 
| 0035     VSAM Index WRITECHK Flag            1 byte flag Retired with PKZIP 5.0 +␊ | 
| 0036     VSAM Owner                          8 bytes EBCDIC text string␊ | 
| 0037     VSAM Index Owner                    8 bytes EBCDIC text string␊ | 
| 0038     Reserved␊ | 
| 0039     Reserved␊ | 
| 003A     Reserved␊ | 
| 003B     Reserved␊ | 
| 003C     Reserved␊ | 
| 003D     Reserved␊ | 
| 003E     Reserved␊ | 
| 003F     Reserved␊ | 
| 0040     Reserved␊ | 
| 0041     Reserved␊ | 
| 0042     Reserved␊ | 
| 0043     Reserved␊ | 
| 0044     Reserved␊ | 
| 0045     Reserved␊ | 
| 0046     Reserved␊ | 
| 0047     Reserved␊ | 
| 0048     Reserved␊ | 
| 0049     Reserved␊ | 
| 004A     Reserved␊ | 
| 004B     Reserved␊ | 
| 004C     Reserved␊ | 
| 004D     Reserved␊ | 
| 004E     Reserved␊ | 
| 004F     Reserved␊ | 
| 0050     Reserved␊ | 
| 0051     Reserved␊ | 
| 0052     Reserved␊ | 
| 0053     Reserved␊ | 
| 0054     Reserved␊ | 
| 0055     Reserved␊ | 
| 0056     Reserved␊ | 
| 0057     Reserved␊ | 
| 0058     PDS/PDSE Member TTR Info.           6 bytes  Big Endian␊ | 
| 0059     PDS 1st LMOD Text TTR               3 bytes  Big Endian␊ | 
| 005A     PDS LMOD EP Rec #                   4 bytes  Big Endian␊ | 
| 005B     Reserved␊ | 
| 005C     Max Length of records               2 bytes  Big Endian␊ | 
| 005D     PDSE Flag                           1 byte flag␊ | 
| 005E     Reserved␊ | 
| 005F     Reserved␊ | 
| 0060     Reserved␊ | 
| 0061     Reserved␊ | 
| 0062     Reserved␊ | 
| 0063     Reserved␊ | 
| 0064     Reserved␊ | 
| 0065     Last Date Referenced                4 bytes  Packed Hex "yyyymmdd"␊ | 
| 0066     Date Created                        4 bytes  Packed Hex "yyyymmdd"␊ | 
| 0068     GZIP two words                      8 bytes␊ | 
| 0071     Extended NOTE Location              12 bytes Big Endian␊ | 
| 0072     Archive device UNIT                 6 bytes  EBCDIC␊ | 
| 0073     Archive 1st Volume                  6 bytes  EBCDIC␊ | 
| 0074     Archive 1st VOL File Seq#           2 bytes  Binary␊ | 
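
The field definition structure above can be walked with a short loop.
The following Python sketch is illustrative only and is not part of the
format definition.  It assumes that the length and field code are stored
Big Endian, and that "field length including length" counts the 2-byte
length, the 2-byte field code, and the data; neither assumption is stated
explicitly in this appendix.  It also assumes that "Packed Hex" means one
decimal digit per nibble.  The names iter_attributes and
decode_packed_date are hypothetical.

```python
import struct

def iter_attributes(buf: bytes):
    """Yield (field_code, data) for each attribute record in buf.

    Assumptions (not confirmed by the specification): length and code
    are Big Endian, and the length covers length + code + data.
    """
    pos = 0
    while pos + 4 <= len(buf):
        length, code = struct.unpack_from(">HH", buf, pos)
        if length < 4 or pos + length > len(buf):
            raise ValueError(f"bad attribute length {length} at offset {pos}")
        yield code, buf[pos + 4 : pos + length]
        pos += length
    # Fewer than 4 trailing bytes cannot hold a record and are ignored.

def decode_packed_date(data: bytes) -> str:
    """Turn a 4-byte Packed Hex date (codes 0065, 0066) into "yyyymmdd".

    Assumption: each nibble holds one decimal digit.
    """
    if len(data) != 4:
        raise ValueError("packed date must be 4 bytes")
    return data.hex()
```

For example, a record with field code 0066 and data bytes 20 07 04 11
would decode to the string "20070411" under these assumptions.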
| ␊ | 
| APPENDIX C - Zip64 Extensible Data Sector Mappings (EFS)␊ | 
| --------------------------------------------------------␊ | 
| ␊ | 
| -Z390   Extra Field:␊ | 
| ␊ | 
| The following is the general layout of the attributes for the ␊ | 
| ZIP 64 "extra" block for extended tape operations. Portions of ␊ | 
| this extended tape processing technology are covered under a ␊ | 
| pending patent application. The use or implementation in a ␊ | 
| product of certain technological aspects set forth in the ␊ | 
| current APPNOTE, including those with regard to strong encryption,␊ | 
| patching or extended tape operations, requires a license from␊ | 
| PKWARE.  Please contact PKWARE with regard to acquiring a license. ␊ | 
| ␊ | 
| ␊ | 
| Note: some fields are stored in Big Endian format.  All text is ␊ | 
|       in EBCDIC format unless otherwise specified.␊ | 
| ␊ | 
|         Value       Size          Description␊ | 
|         -----       ----          -----------␊ | 
| (Z390)  0x0065      2 bytes       Tag for this "extra" block type␊ | 
|         Size        4 bytes       Size for the following data block␊ | 
|         Tag         4 bytes       EBCDIC "Z390"␊ | 
|         Length71    2 bytes       Big Endian␊ | 
|         Subcode71   2 bytes       Enote type code␊ | 
|         FMEPos      1 byte␊ | 
|         Length72    2 bytes       Big Endian␊ | 
|         Subcode72   2 bytes       Unit type code␊ | 
|         Unit        1 byte        Unit␊ | 
|         Length73    2 bytes       Big Endian␊ | 
|         Subcode73   2 bytes       Volume1 type code␊ | 
|         FirstVol    1 byte        Volume␊ | 
|         Length74    2 bytes       Big Endian␊ | 
|         Subcode74   2 bytes       FirstVol file sequence␊ | 
|         FileSeq     2 bytes       Sequence␊ | 
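
Because the "Z390" tag is stored in EBCDIC, its bytes differ from the
ASCII spelling.  The following Python sketch is illustrative only.  It
assumes EBCDIC code page 037 (the specification does not name a code
page; letters and digits have the same values in the common EBCDIC code
pages) and that the caller has already skipped the 0x0065 tag and the
size field.  The name is_z390_block is hypothetical.

```python
# EBCDIC spelling of the block tag; code page 037 is an assumption.
Z390_TAG = "Z390".encode("cp037")

def is_z390_block(data: bytes) -> bool:
    """Return True if the data block begins with EBCDIC "Z390".

    `data` is assumed to be the bytes that follow the size field.
    """
    return data[:4] == Z390_TAG
```

Under this assumption the tag bytes are E9 F3 F9 F0, so a block that
begins with the ASCII bytes for "Z390" would not match.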
| ␊ | 
| APPENDIX D - Language Encoding (EFS)␊ | 
| ------------------------------------␊ | 
| ␊ | 
| The ZIP format has historically supported only the original IBM PC character ␊ | 
| encoding set, commonly referred to as IBM Code Page 437.  This limits storing ␊ | 
| file name characters to only those within the original MS-DOS range of values ␊ | 
| and does not properly support file names in other character encodings, or ␊ | 
| languages. To address this limitation, this specification will support the ␊ | 
| following change. ␊ | 
| ␊ | 
| If general purpose bit 11 is unset, the file name and comment should conform ␊ | 
| to the original ZIP character encoding.  If general purpose bit 11 is set, the ␊ | 
| file name and comment must support The Unicode Standard, Version 4.1.0 or ␊ | 
| greater using the character encoding form defined by the UTF-8 storage ␊ | 
| specification.  The Unicode Standard is published by The Unicode␊ | 
| Consortium (www.unicode.org).  UTF-8 encoded data stored within ZIP files ␊ | 
| is expected to not include a byte order mark (BOM). ␊ | 
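
The bit 11 rule above can be sketched in Python as follows.  This is an
illustrative example rather than a normative procedure.  The names
UTF8_FLAG and decode_name are hypothetical, and the sketch relies on
Python's built-in "cp437" and "utf-8" codecs.  It does not consult the
0x0008 Extra Field, whose layout is currently undefined.

```python
UTF8_FLAG = 0x0800  # general purpose bit 11 (bit numbering starts at 0)

def decode_name(raw: bytes, general_purpose_flags: int) -> str:
    """Decode a file name or comment according to general purpose bit 11."""
    if general_purpose_flags & UTF8_FLAG:
        # Bit 11 set: UTF-8, expected to carry no byte order mark.
        return raw.decode("utf-8")
    # Bit 11 unset: original ZIP encoding, IBM Code Page 437.
    return raw.decode("cp437")
```

The same byte sequence can decode differently depending on the flag, so
a reader should test bit 11 before interpreting any name or comment bytes.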
| ␊ | 
| Applications may choose to supplement this file name storage through the use ␊ | 
| of the 0x0008 Extra Field.  Storage for this optional field is currently ␊ | 
| undefined; however, it will be used to allow storing extended information ␊ | 
| on source or target encoding that may further assist applications with file ␊ | 
| name, or file content encoding tasks.  Please contact PKWARE with any␊ | 
| requirements on how this field should be used.␊ | 
| ␊ | 
| The 0x0008 Extra Field storage may be used with either setting for general ␊ | 
| purpose bit 11.  Examples of the intended usage for this field are to store ␊ | 
| whether "modified-UTF-8" (JAVA) is used, or UTF-8-MAC.  Similarly, other ␊ | 
| commonly used character encoding (code page) designations can be indicated ␊ | 
| through this field.  Formalized values for use of the 0x0008 record remain ␊ | 
| undefined at this time.  The definition for the layout of the 0x0008 field␊ | 
| will be published when available.  Use of the 0x0008 Extra Field provides␊ | 
| for storing data within a ZIP file in an encoding other than IBM Code␊ | 
| Page 437 or UTF-8.␊ | 
| ␊ | 
| General purpose bit 11 will not imply any encoding of file content or␊ | 
| password.  Values defining character encoding for file content or ␊ | 
| password must be stored within the 0x0008 Extended Language Encoding ␊ | 
| Extra Field.␊ | 
| ␊ | 
| ␊ |