diff --git a/AUTHORS b/AUTHORS
index e28a39b..df4cf52 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -1,50 +1,51 @@
 Original version by:
     Dave Smith <dave.s@earthcorp.com>
     Dave Smith <davesmith@users.sourceforge.net>
 
 Current maintainer:
     Carl Byington <carl@five-ten-sg.com>
 
 With contributions by:
     Joseph Nahmias <jello@costa.debian.org> -- bounces
     Joseph Nahmias <joe@nahmias.net>
     Arne Ahrend <aahrend@web.de>
     Nigel Horne <njh@bandsman.co.uk>
     Chris Halls <halls@debian.org>
     Stevens Miller <smiller@novadatalabs.com>
     Brad Hards <bradh@frogmouth.net>
     Alexander Grau <alexandergrau@gmx.de>
     Antonio Palama <palama@inwind.it>
     Sean Loaring <sloaring@tec-man.com>
     James Woodcock
     Joachim Metz <joachim.metz@gmail.com>
     Robert Simpson <rsimpson@idiscoverglobal.com>
     Justin Greer <jgreer@nextpoint.com>
     Bharath Acharya <abharath@novell.com>
     Robert Harris <robert.f.harris@blueyonder.co.uk>
     David Cuadrado <krawek@gmail.com>
     Chris Eagle <cseagle@redshift.com>
     Fridrich Strba <fstrba@novell.com>
     Emmanuel Andry <eandry@mandriva.org>
     hggdh <hggdh2@gmail.com>
     bharder <bharder@methodlogic.net>
     Chris White <chris@soniannetworks.com>
     Roberto Polli <robipolli@gmail.com>
     Lee Ayres <ayres@interhack.com>
     Hugo DesRosiers <info@akralogic.com>
     Kenneth Berland <ken@hero.com>
     Leo 'costela' Antunes <costela@debian.org>
     Svante Signell <svante.signell@telia.com>
     Dominique Leuenberger a.k.a. Dimstar <dimstar@opensuse.org>
     Daniel Gryniewicz <dang@linuxbox.com>
     AJ Shankar <aj@everlaw.com>
     Jeffrey Morlan <jeffrey@everlaw.com>
     Hans Liss <Hans@Liss.pp.se>
     Igor Stroh <igor.stroh@rulim.de>
     Zachary Travis <ztravis@everlaw.com>
     Vitaliy Didik <ariman@inbox.ru>
+    Alfredo Esteban <aedelatorre@gmail.com>
 
 Testing team:
     Mac OSX - Michael Watson <mike@mikeandgayle.com>
     Cygwin/Mingw - Fridrich Strba <fstrba@novell.com>
     Cygwin - Chris Eagle <cseagle@redshift.com>
diff --git a/ChangeLog b/ChangeLog
index 2099a3f..833d2d3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,931 +1,935 @@
+LibPST 0.6.72 (2017-12-07)
+===============================
+    * Alfredo Esteban - add -l and -f options to lspst.
+
 LibPST 0.6.71 (2017-07-21)
 ===============================
     * Zachary Travis - Add support for the OST 2013 format, and Content-Disposition filename key fix for outlook compatibility
 
 LibPST 0.6.70 (2017-02-08)
 ===============================
     * Jeffrey Morlan - pst_getID2 must not recurse into children
 
 LibPST 0.6.69 (2016-10-29)
 ===============================
     * fix bugs in code allowing folders containing multiple item types
 
 LibPST 0.6.68 (2016-08-29)
 ===============================
     * allow folders containing multiple item types, e.g.
email and calendar * better detection of valid internet headers LibPST 0.6.67 (2016-07-06) =============================== * Jeffrey Morlan - multiple bug fixes and an optimization see 'hg log -v' for details LibPST 0.6.66 (2015-12-21) =============================== * Igor Stroh - Added Content-ID header support LibPST 0.6.65 (2015-09-11) =============================== * Jeffrey Morlan - fix multiple Content-Type headers * Hans Liss - debug level output LibPST 0.6.64 (2015-03-09) =============================== * AJ Shankar fixes for attachment processing and body encodings that contain embedded null chars LibPST 0.6.63 (2013-12-27) =============================== * Daniel Gryniewicz found buffer overrun in LIST_COPY_TIME LibPST 0.6.62 (2013-09-22) =============================== * 983596 - Old dependency filter breaks file coloring LibPST 0.6.61 (2013-08-06) =============================== * move documentation to unversioned directory LibPST 0.6.60 (2013-06-12) =============================== * patch from Dominique Leuenberger to add AC_USE_SYSTEM_EXTENSIONS * add readpst -a option for attachment stripping LibPST 0.6.59 (2013-05-17) =============================== * add autoconf checking for libgsf LibPST 0.6.58 (2012-12-28) =============================== * fix From quoting on embedded rfc/822 messages LibPST 0.6.57 (2012-12-27) =============================== * remove useless dependencies LibPST 0.6.56 (2012-12-24) =============================== * merge -m .msg files code into main branch LibPST 0.6.55 (2012-05-08) =============================== * preserve bcc headers * document -C switch to set default character set * space after colon is not required in header fields LibPST 0.6.54 (2011-11-04) =============================== * embedded rfc822 messages might contain rtf encoded bodies LibPST 0.6.53 (2011-07-10) =============================== * add Status: header in output * allow fork for parallel processing of individual email folders in separate mode * 
proper handling of --with-boost-python option LibPST 0.6.52 (2011-05-22) =============================== * fix dangling freed pointer in embedded rfc822 message processing * allow broken outlook internet header field - it sometimes contains fragments of the message body rather than headers LibPST 0.6.51 (2011-04-17) =============================== * fix for buffer overrun; attachment size from the secondary list of mapi elements overwrote proper size from the primary list of mapi elements. fedora bugzilla 696263 LibPST 0.6.50 (2010-12-24) =============================== * rfc2047 and rfc2231 encoding for non-ascii headers and attachment filenames LibPST 0.6.49 (2010-09-13) =============================== * fix to ignore embedded objects that are not email messages LibPST 0.6.48 (2010-09-02) =============================== * fix for broken internet headers from Outlook. * fix ax_python.m4 to look for python2.7 * Subpackage Licensing, add COPYING to -libs. * use mboxrd from quoting for output formats with multiple messages per file * use no from quoting for output formats with single message per file LibPST 0.6.47 (2010-05-07) =============================== * patches from Kenneth Berland for solaris. * fix output file name numbering to start at 1 rather than 2. LibPST 0.6.46 (2010-02-13) =============================== * prefer libpthread over librt for finding sem_init function. * rebuild for fedora 13 change in implicit dso linking semantics. LibPST 0.6.45 (2009-11-18) =============================== * patch from Hugo DesRosiers to export categories and notes into vcards. * extend that patch to export categories into vcalendar appointments also. LibPST 0.6.44 (2009-09-20) =============================== * fix --help usage; readpstlog is gone, debug files are now ascii text. * patch from Lee Ayres to add file name extensions in separate mode. * allow mixed items types in a folder in separate mode. 
LibPST 0.6.43 (2009-09-12) =============================== * patches from Justin Greer. add code pages 1200 and 1201 to the list for iconv add support for 0x0201 indirect blocks that point to 0x0101 blocks add readpst -t option to select output item types fix (remove) extra new line inside headers * cleanup base64 encoding to remove duplicate code. * patch from Chris White to avoid segfault with embedded appointments. * patch from Roberto Polli to add creation of some Thunderbird specific meta files. * patch from Justin Greer to ignore b5 tables at offset zero. * output type filtering can now be used to handle folders with multiple item types. * better decoding of rfc822 embedded message attachments. * better detection of dsn delivery reports LibPST 0.6.42 (2009-09-03) =============================== * patch from Fridrich Strba to build with DJGPP DOS cross-compiler. LibPST 0.6.41 (2009-06-23) =============================== * fix ax_python detection - should not use locate command * checking for fedora versions is not needed LibPST 0.6.40 (2009-06-23) =============================== * fedora 11 has python2.6 * remove pdf version of the man pages LibPST 0.6.39 (2009-06-21) =============================== * fedora > 10 moved to boost-python-devel LibPST 0.6.38 (2009-06-21) =============================== * add python module interface to the shared library for easy scripting. * the shared library must never write to stdout or stderr. * fix pst_attach_to_mem so the caller does not need to initialize the buffer pointer. * remove readpst -C switch, obsolete debugging code. * update version to 4:0:0 since we made many changes to the interface. * removed contact->access_method since we don't have a mapi element for it. * changed pst_attach_to_mem to return pst_binary structure. * decode more recurrence mapi elements. * readpst changes for parallel operation on multi processor machines. * remove readpstlog - the debug log files are now plain ascii. 
Add locking if needed so parallel jobs can produce debug logs. * more cleanup of the shared library interface, but still not fully thread safe. * make nested mime multipart/alternative to hold the text/html parts so the topmost level is almost always multipart/mixed. * the shared library interface should now be thread safe. * patch from Fridrich Strba to build on win32. * remove unreferenced code. LibPST 0.6.37 (2009-04-17) =============================== * add pst_attach_to_mem() back into the shared library interface. * improve developer documentation. * fix memory leak caught by valgrind. LibPST 0.6.36 (2009-04-14) =============================== * spec file cleanup with multiple sub packages. * add doxygen devel-doc documentation for the shared library. * switch back to fully versioned subpackage dependencies. * more cleanup on external names in the shared object file. LibPST 0.6.35 (2009-04-08) =============================== * fix bug where we failed to pickup the last extended attribute. * patch from Emmanuel Andry to fix potential security bug in pst2dii with printf(err). * properly add trailing mime boundary in all modes. * move version-info into main configure.in, and set it properly * prefix all external symbols in the shared library with pst_ to avoid symbol clashes with other shared libraries. * new debianization from hggdh. * build separate libpst, libpst-libs, libpst-devel rpms. * remove many functions from the interface by making them static. LibPST 0.6.34 (2009-03-19) =============================== * improve consistency checking when fetching items from the pst file. * avoid putting mixed item types into the same output folder. LibPST 0.6.33 (2009-03-17) =============================== * fix fedora 11 type mismatch warning (actually an error in this case). * fix large file support, some sytems require config.h to be included earlier in the compilation. * compensate for iconv conversion to utf-7 that produces strings that are not null terminated. 
* don't produce empty attachment files in separate mode. LibPST 0.6.32 (2009-03-14) =============================== * fix ppc64 compile error. LibPST 0.6.31 (2009-03-14) =============================== * bump version for fedora cvs tagging mistake. LibPST 0.6.30 (2009-03-14) =============================== * improve documentation of .pst format. * remove decrypt option from getidblock - we always decrypt. * rename some structure fields to reflect our better understanding of the pst format. * track character set individually for each mapi element, since some could be unicode (therefore utf8) and others sbcs with character set specified by the mapi object. remove charset option from pst2ldif since we get that from each object now. * more code cleanup. * use AM_ICONV for better portability of the library location. * structure renaming to be more specific. * improve internal doxygen documentation. * avoid emitting bogus empty email messages into contacts and calendar files. LibPST 0.6.29 (2009-02-24) =============================== * fix for 64bit on Fedora 11 LibPST 0.6.28 (2009-02-24) =============================== * add X-libpst-forensic-* headers to capture items of interest that are not used by normal mail clients. * improve decoding of multipart/report and message/rfc822 mime types. * improve character set handling - don't try to convert utf-8 to single byte for fields that were not originally unicode. if the conversion fails, leave the data in utf-8. * fix embedded rfc822 messages with attachments. LibPST 0.6.27 (2009-02-07) =============================== * fix for const correctness on Fedora 11 LibPST 0.6.26 (2009-02-07) =============================== * patch from Fridrich Strba for building on mingw and general cleanup of autoconf files * add processing for pst files of type 0x0f * start adding support for properly building and installing libpst.so and the header files required to use it. 
* remove version.h since the version number is now in config.h * more const correctness issues regarding getopt() * consistent ordering of our include files. all system includes protected by ifdef HAVE_ from autoconf. * strip and regenerate all MIME headers to avoid duplicates. problem found by Michael Watson on Mac OSX. * do a better job of making unique MIME boundaries. * only use base64 coding when strictly necessary. * more cleanup of #include files. common.h is the only file allowed to include system .h files unprotected by autoconf HAVE_ symbols. define.h is the only other file allowed to include system .h files. define.h is never installed; common.h is installed if we are building the shared library. * recover dropped pragma pack line, use int64_t rather than off_t to avoid forcing users of the shared library to enable large file support. * add pragma packing support for sun compilers. * fix initial from header in mbox format. * start moving to PST_LE_GET* rather than LE*_CPU macros so we can eventually remove the pragma packing. * patch from Fridrich Strba, some systems need extra library for regex. LibPST 0.6.25 (2009-01-16) =============================== * improve handling of content-type charset values in mime parts LibPST 0.6.24 (2008-12-11) =============================== * patch from Chris Eagle to build on cygwin LibPST 0.6.23 (2008-12-04) =============================== * bump version to avoid cvs tagging mistake in fedora LibPST 0.6.22 (2008-11-28) =============================== * patch from David Cuadrado to process emails with type PST_TYPE_OTHER * base64_encode_multiple() may insert newline, needs larger malloc * subject lines shorter than 2 bytes could segfault LibPST 0.6.21 (2008-10-21) =============================== * fix title bug with old schema in pst2ldif. * also escape commas in distinguished names per rfc4514. 
LibPST 0.6.20 (2008-10-09) =============================== * add configure option --enable-dii=no to remove dependency on libgd. * many fixes in pst2ldif by Robert Harris. * add -D option to include deleted items, from Justin Greer * fix from Justin Greer to add missing email headers * fix from Justin Greer for my_stristr() * fix for orphan children when building descriptor tree * avoid writing uninitialized data to debug log file * remove unreachable code * create dummy top-of-folder descriptor if needed for corrupt pst files LibPST 0.6.19 (2008-09-14) =============================== * Fix base64 encoding that could create long lines * Initial work on a .so shared library from Bharath Acharya. LibPST 0.6.18 (2008-08-28) =============================== * Fixes for iconv on Mac from Justin Greer. LibPST 0.6.17 (2008-08-05) =============================== * More fixes for 32/64 bit portability on big endian ppc. LibPST 0.6.16 (2008-08-05) =============================== * Use inttypes.h for portable printing of 64 bit items. LibPST 0.6.15 (2008-07-30) =============================== * Patch from Robert Simpson for file handle leak in error case. * Fix for missing length on lz decompression, bug found by Chris White. LibPST 0.6.14 (2008-06-15) =============================== * Fix my mistake in debian packaging. LibPST 0.6.13 (2008-06-13) =============================== * Patch from Robert Simpson for encryption type 2. * Fix the order of testing item types to avoid claiming there are multiple message stores. LibPST 0.6.12 (2008-06-10) =============================== * Patch from Joachim Metz for debian packaging, and fix for incorrect length on lz decompression. LibPST 0.6.11 (2008-06-03) =============================== * Use ftello/fseeko to properly handle large files. * Document and properly use datasize field in b5 blocks. * Fix some MSVC compile issues and collect MSVC dependencies into one place. 
LibPST 0.6.10 (2008-05-29) =============================== * Patch from Robert Simpson <rsimpson@idiscoverglobal.com> fix doubly-linked list in the cache_ptr code, and allow arrays of unicode strings (without converting them). * More changes for Fedora packaging (#434727) * Fixes for const correctness. LibPST 0.6.9 (2008-05-16) =============================== * Patch from Joachim Metz <joachim.metz@gmail.com> for 64 bit compile. * Signed/unsigned cleanup from 'CFLAGS=-Wextra ./configure'. * Reindent vbuf.c to make it readable. * Fix pst format documentation for 8 byte backpointers. LibPST 0.6.8 (2008-03-05) =============================== * Initial version of pst2dii to convert to Summation dii load file format. * Changes for Fedora packaging (#434727) LibPST 0.6.7 (2008-02-16) =============================== * Work around bogus 7c.b5 blocks in some messages that have been read. They appear to have attachments, but of some unknown format. Before the message was read, it did not have any attachments. * Use autoscan to cleanup our autoconf system. * Use autoconf to detect when we need to use our XGetopt files and other header files. * More fields, including BCC. * Fix missing LE32_CPU byte swapping for FILETIME types. LibPST 0.6.6 (2008-01-31) =============================== * More code cleanup, removing unnecessary null terminations on binary buffers. All pst file reads now go thru one function. Logging all pst reads to detect cases where we read the same data multiple times - discovers node sizes are actually 512 bytes. * Switch from cvs to mercurial source control. LibPST 0.6.5 (2008-01-22) =============================== * More code cleanup, removing obsolete code. All the boolean flags of type 0xb have length 4, so these are all 32 bits in the file. Libpst treats them all as 16 bits, but at least we are consistent. 
* More fields decoded - for example, see <http://msdn2.microsoft.com/en-us/library/aa454925.aspx> We should be able to use that data for much more complete decoding. * Move the rpm group to Applications/Productivity consistent with Evolution. LibPST 0.6.4 (2008-01-19) =============================== * More fixes for Outlook 2003 64 bit parsing. We observed cases of compressed RTF bodies (type 0x1009) with zero length. * Document type 0x0101 descriptor blocks and process them. * Fix large file support - we need to include config.h before any standard headers. * Merge following changes from svn snapshot from Alioth: * Add new fields to appointment for recurring events (SourceForge #304198) * Map IPM.Task items to PST_TYPE_TASK. * Applied patch to remove compiler warnings, thanks! (SourceForge #304314) * Fix crash with unknown reference type * Fix more memory issues detected by valgrind * lspst - add usage mesage and option parsing using getopt (SourceForge #304199) * Fix crash caused by invalid free calls * Fix crash when email subject is empty * Fix memory and information leak in hex debug dump LibPST 0.6.3 (2008-01-13) =============================== * More type consistency issues found by splint. LibPST 0.6.2 (2008-01-12) =============================== * More fixes for Outlook 2003 64 bit parsing. * All buffer sizes changed to size_t, all file offsets changed to off_t, all function names start with pst_, many other type consistency issues found by splint. Many changes to #llx in debug printing for 64 bit items. All id values are now uint64_t. LibPST 0.6.1 (2008-01-06) =============================== * Outlook 2003 64 bit parsing. Some documentation from Alexander Grau <alexandergrau@gmx.de> and patches from Sean Loaring <sloaring@tec-man.com>. * fix from Antonio Palama <palama@inwind.it> for email items that happen to have item->contact non null, and were being processed as contacts. * Add large file support so we can read .pst files larger than 2gb. 
* Change lspst to be similar to readpst, properly using recursion to walk the tree, and testing item types. Add a man page for lspst. LibPST 0.5.12 (2007-10-02) =============================== * security fix from Brad Hards <bradh@frogmouth.net> for buffer overruns in liv-zemple decoding for corrupted or malicious pst files. LibPST 0.5.11 (2007-08-24) =============================== * fix from Stevens Miller <smiller@novadatalabs.com> for unitialized variable. LibPST 0.5.10 (2007-08-20) =============================== * fix yet more valgrind errors - finally have a clean memory check. * restructure readpst.c for proper recursive tree walk. * buffer overrun test was backwards, introduced at 0.5.6 * fix broken email attachments, introduced at 0.5.6 LibPST 0.5.9 (2007-08-12) =============================== * fix more valgrind errors. LibPST 0.5.8 (2007-08-10) =============================== * fix more valgrind errors. lzfu_decompress needs to return the actual buffer size, since the lz header overestimates the size. This caused base64_encode to encode undefined bytes into the email attachment. LibPST 0.5.7 (2007-08-09) =============================== * fix valgrind errors, using uninitialized data. * improve debug logging and readpstlog for indented listings. * cleanup documentation. LibPST 0.5.6 (2007-07-15) =============================== * Fix to allow very small pst files with only one node in the tree. We were mixing signed/unsigned types in comparisons. * More progress decoding the basic structure 7c blocks. Many four byte values may be ID2 indices with data outside the buffer. * Start using doxygen to generate internal documentation. LibPST 0.5.5 (2007-07-10) =============================== * merge the following changes from Joe Nahmias version: * Lots of memory fixes. Thanks to Nigel Horne for his assistance tracking these down! * Fixed creation of vCards from contacts, thanks to Nigel Horne for his help with this! 
* fix for MIME multipart/alternative attachments. * added -c options to readpst manpage. * use 8.3 attachment filename if long filename isn't available. * new -b option to skip rtf-body.rtf attachments. * fix format of From header lines in mbox files. * Add more appointment fields, thanks to Chris Halls for tracking them down! LibPST 0.5.4 (2006-02-25) =============================== * patches from Arne, adding MH mode, remove leading zeros from the generated numbered filenames starting with one rather than zero. Miscellaneous code cleanup. * document the "7c" descriptor block format. LibPST 0.5.3 (2006-02-20) =============================== * switch to gnu autoconf/automake. This breaks the MS VC++ projects since the source code is now in the src subdirectory. * documentation switched to xml, building man pages and html from the master xml copy. * include rpm .spec file for building src and binary rpms. LibPST 0.5.2 (2006-02-18) =============================== * Added pst2ldif to convert the contacts to ldif format for import into ldap databases. * Major changes to libpst.c to properly use the node depth values from the b-tree nodes. We also use the item count values in the nodes rather than trying to guess how many items are active. * Cleanup whitespace - using tabs for every four columns. LibPST 0.5.1 (17 November 2004) =============================== Well, alot has happened since the last release of libpst. Release / Management: * The project has forked! The new maintainer is Joseph Nahmias. * We have changed hosting sites, thanks to sourceforge for hosting to this point. From this point forward we will be using alioth.debian.org. * The project is now using SubVersioN for source control. You can get the latest code by running: svn co svn://svn.debian.org/svn/libpst/trunk . * See <http://lists.alioth.debian.org/pipermail/libpst-devel/2004-November/000000.html> for more information. Code Changes: * Added lspst program to list items in a PST. Still incomplete. 
* Added vim folding markers to readpst.c * avoid the pseudo-prologue that MS prepends to the email headers * fix build on msvc, since it doesn't have sys/param.h * Re-vamped Makefile: * Only define CFLAGS in Makefileif missing * fixed {un,}install targets in Makefile * Fixed up build process in Makefile * Added mozilla conversion script from David Binard * Fixed bogus creation of readpst.log on every invocation * escaped dashes and apostrophe in manpages * Updated TODO * added manpages from debian pkg * fix escaped-string length count to consider '\n', thanks to Paul Bakker <bakker@fox-it.com>. * ensure there's a blank line between header and body patch from <johnh@aproposretail.com> (SourceForge #890745). * Apply accumulated endian-related patches * Removed unused files, upstream's debian/ dir -- Joe Nahmias <joe@nahmias.net> LibPST v0.5 =========== It is with GREAT relief that I bring you version 0.5 of the LibPST tools! Through great difficulties, this tool has survived and expanded to become even better. The changes are as follows: * RTF support. We can now decompress RTF bodies in emails, and are saved as attachments * Better support in reading the indexes. Fixed many bugs with them * Improved reliability. "Now we are getting somewhere!" * Improved compiling. Hopefully we won't be hitting too many compile errors now. * vCard handling. Contacts are now exported as vCard entries. * vEvent handling. Support has begun on exporting Calendar entries as events * Support for Journal entries has also begun If you have any problems with this release, don't hesitate to contact me. These changes come to you, as always, free under the GPL license!! What a wonderful thing it is. It does mean that you can write your own program off of this library and distribute it also for free. However, anyone with commercial interests for developing applications they will be charging for are encouraged to get in touch with me, as I am sure we can come to some arrangement. 
Dave Smith <dave.s@earthcorp.com> LibPST v0.4.3 ============= Bug fix release. No extra functionality Dave Smith <dave.s@earthcorp.com> LibPST v0.4.2 ============= The debug system has had an overhaul. The debug messages are no longer printed to the screen when they are enabled. They are dumped to a binary file. There is another utility called "readlog" that I have written to handle these log files. It should make it easier to selectively view bits of a log file. It also shows the position that the log message was printed from. There is a new switch in readpst. It is -d. It enables the user to specify the log file which the binary log is written to. If the switch isn't used, the default file of "readpst.log" is used. The code is now Visual C++ compatible. It has compiled on Visual C++ .net Standard edition, and produces the readpst.exe file. Use the project file included in this distribution. There have been minor improvements elsewhere too. LibPST v0.4.1 ============= Fixed a couple more bugs. Is it me or do bugs just insert themselves in random, hard to find places! Cured a few problems with regard to emails with multiple embeded items. They are not fully re-created using Mime-types, but are accessible with the -S switch (which saves everything as seperate items) Fixed a problem reading the first index. Back sliders are now detected. (ie when the value following the current one is smaller, not bigger!) Added some error messages when we try and read outside of the PST file, this was causing a few problems before, cause the return value wasn't always checked, so it was possible to be reading random data, and trying to make sense of it! Anyway, if you find any problems, don't hesitate to mail me Dave Smith <dave.s@earthcorp.com> LibPST v0.4 =========== Fixed a nasty bug that occasionally corrupted attachments. Another bug with regard to reading of indexes (also occasional). Another output method has been added which is called "Seperate". 
It is activated with the -S switch. It operates in the following manor: |--Inbox-->000000 | 000001 | 000002 |--Sentmail-->0000000 | 0000001 | 0000002 All the emails are stored in seperate files counting from 0 upwards, in a folder named as the PST folder. When an email has an attachment, it is saved as a seperate file. The filename for the attachment is made up of 2 parts, the first is the email number to which it belongs, the second is its filename. The should now be runnable on big-endian machines, if the define.h file is first modified. The #define LITTLE_ENDIAN must be commented out, and the #define BIG_ENDIAN must be uncommented. More verbose error messages have been added. Apparently people got confused when the program stopped for no visible reason. This has now been resolved. Thanks for the continued support of all people involved. Dave Smith <dave.s@earthcorp.com> Libpst v0.3.4 ============= Several more fixes. An Infinite loop and incorrect interpreting of item index attributes. Work has started on making the code executable on big endian CPUs. At present it should work with Linux on these CPUs, but I would appreciate it if you could provide feedback with regard to it's performance. I am also working with some other people at make it operate on Solaris. A whole load more items are now recognized by the Item records. With more items in Emails and Folders. I haven't got to the Contacts yet. Anyway, this is what I would call a minor feature enhancment and bugfix release. Dave Smith <dave.s@earthcorp.com> LibPST v0.3.3 ============= Fixed several items. Mainly memory leaks. Loads of them! oops.. I have added a new program, mainly of debugging, which when passed an ID value and a pst file, will extract and decrypt that ID from the pst file. I don't see it being a huge attraction, or of much use to most people, but it is another example of writing an application to use the libpst interface. Another fix was in the reading of the item index. 
This has hopefully now been corrected. The result of this bug was that not all the emails in a folder were converted. Hopefully you should have more luck now. Dave Smith <dave.s@earthcorp.com> LibPST v0.3.2 ============= Quick bugfix release. There was a bug in the decryption of the basic encryption that outlook uses. One byte, 0x6c, was incorrectly decrypted to 0x6c instead of 0xcd. This release fixes this bug. Sorry... LibPST v0.3.1 ============= Minor improvements. Fixed bug when linking multiple blocks together, so now the linking blocks are not "encrypted" when trying to read them. LibPST v0.3 =========== A lot of bug fixing has been done for this release. Testing has been done on the creation of the files by readpst. Better handling of large binaries being extracted from the PST file has been implemented. Quite a few reports have come in about not being able to compile on Darwin. This could be down to using macros with variable parameter lists. This has now been changed to use C functions with variable parameters. I hope this fixes a lot of problems. Added support for recreating the folder structure into normal directories. For Instance: Personal Folders |-Inbox | |-Jokes | |-Meetings |-Send Items each folder containing an mbox file with the correct emails for that folder. Dave Smith <dave.s@earthcorp.com> LibPST v0.3 beta1 ================= Again, a shed load of enhancements. More work has been done on the mime creation. A bug has been fixed that was letting part of the attachments that were created disappear. A major enhancement is that "compressible encryption" support has been added. This was an incredibly simple method to use. It is basically a ceasar cipher. It has been noted by several users already that the PST password that Outlook uses, serves *no purpose*. It is not used to encrypt the PST, it is mearly stored there. This means that the readpst application is able to convert PST files without knowing the password. 
Microsoft have some explaning to do! Output files are now not overwritten if they already exist. This means that if you have two folders in your PST file named "fred", the first one encountered will be named "fred" and the second one will be named "fred00000001". As you can see, there is enough room there for many duplicate names! Output filenames are now restricted. Any "/" or "\" characters in the name are replaced with "_". If you find that there are any other characters that need to be changed, could you please make me aware! Thanks to Berry Wizard for help with supporting the encryption. Thanks to Auke Kok, Carolus Walraven and Yogesh Kumar Guatam for providing debugging information and testing. Dave Smith <dave.s@earthcorp.com> LibPST v0.2 beta1 ================= Hello once more... Attachments are now re-created in mime format. The method is very crude and could be prone to over generalisation. Please test this version, and if attachments are not recreated correctly, please send me the email (complete message source) of the original and converted. Cheers. I hope this will work for everyone who uses this program, but reality can be very different! Let us see how it goes... Dave Smith <dave.s@earthcorp.com> LibPST v0.2 alpha1 =========== Hello! Some improvements. The internal code has been changed so that attachments are now processed and loaded into the structures. The readpst program is not finished yet. It needs to convert these binary structs into mime data. At present it just saves them to the current directory, overwriting any previous files with the attachment name. Improvements over previous version: * KMail output is supported - if the "-k" flag is specified, all the directory hierarchy is created using the KMail standard * Lots of bugs and memory leaks fixed Usage: ReadPST v0.2alpha1 implementing LibPST v0.2alpha1 Usage: ./readpst [OPTIONS] {PST FILENAME} OPTIONS: -h - Help. This screen -k - KMail. Output in kmail format -o - Output Dir. 
Directory to write files to. CWD is changed *after* opening pst file -V - Version. Display program version If you want to view lots of debug output, modify a line in "define.h" from "//#define DEBUG_ALL" to "#define DEBUG_ALL". It would then be advisable to pipe all output to a log file: ./readpst -o out pst_file &> logfile Dave Smith LibPST v0.1 =========== Hi Folks! This has been a long, hard slog, but I now feel that I have got somewhere useful. The included program "main" is able to read an Outlook PST file and dump the emails into mbox files, separating each folder into a different mbox file. All the mbox files are stored in the current directory and no attempt is yet made to organise these files into a directory hierarchy. This would not be too difficult to achieve though. Email attachments are not yet handled, neither are Contacts. There is no pretty interface yet, but you can convert a PST file in the following manner ./main {path to PST file} This is very much a work in progress, but I thought I should release this code so that people can lose their conception that outlook files will never be converted to Linux. I am intending that the code I am writing will be developed into greater applications to provide USEFUL tools for accessing and converting PST files into a variety of formats. One point I feel I should make is that Outlook, by default, creates "Compressible Encryption" PST files. I have not, as yet, attempted to write any decryption routines, so you will not be able to convert these files. However, if you create a new PST file and choose not to make an encrypted one, you can copy all your emails into this new one and then convert the unencrypted one. 
I hope you enjoy, Dave Smith diff --git a/configure.in b/configure.in index 67f0b76..77c2f87 100644 --- a/configure.in +++ b/configure.in @@ -1,393 +1,393 @@ AC_PREREQ(2.60) -AC_INIT(libpst,0.6.71,carl@five-ten-sg.com) +AC_INIT(libpst,0.6.72,carl@five-ten-sg.com) AC_CONFIG_SRCDIR([src/libpst.c]) AC_CONFIG_HEADER([config.h]) AC_CONFIG_MACRO_DIR([m4]) AM_INIT_AUTOMAKE AC_CANONICAL_HOST AC_USE_SYSTEM_EXTENSIONS # # 1. Remember that version-info is current:revision:age, and age <= current. # 2. If the source code has changed at all since the last public release, # then increment revision (`c:r:a' becomes `c:r+1:a'). # 3. If any interfaces have been added, removed, or changed since the last # update, increment current, and set revision to 0. # 4. If any interfaces have been added since the last public release, then # increment age, since we should be backward compatible with the previous # version. # 5. If any interfaces have been removed or changed since the last public # release, then set age to 0, since we are not backward compatible. # 6. 
libtool will build libpst.so.x.y.z where the SONAME is libpst.so.x # and x=current-age, y=age, z=revision libpst_version_info='5:14:1' AC_SUBST(LIBPST_VERSION_INFO, [$libpst_version_info]) libpst_so_major='4' AC_SUBST(LIBPST_SO_MAJOR, [$libpst_so_major]) # libpst # version soname so library name # 0.6.35 libpst.so.2 libpst.so.2.0.0 # 0.6.37 libpst.so.2 libpst.so.2.1.0 # 0.6.38 libpst.so.2 libpst.so.2.1.0 # 0.6.40 libpst.so.4 libpst.so.4.0.0 # 0.6.43 libpst.so.4 libpst.so.4.0.1 # 0.6.47 libpst.so.4 libpst.so.4.0.2 # 0.6.48 libpst.so.4 libpst.so.4.0.3 # 0.6.49 libpst.so.4 libpst.so.4.0.4 # 0.6.50 libpst.so.4 libpst.so.4.1.0 # 0.6.51 libpst.so.4 libpst.so.4.1.1 # 0.6.52 libpst.so.4 libpst.so.4.1.2 # 0.6.53 libpst.so.4 libpst.so.4.1.3 # 0.6.54 libpst.so.4 libpst.so.4.1.4 # 0.6.55 libpst.so.4 libpst.so.4.1.5 # 0.6.56 libpst.so.4 libpst.so.4.1.6 # 0.6.57 libpst.so.4 libpst.so.4.1.6 # 0.6.58 libpst.so.4 libpst.so.4.1.7 # 0.6.59 libpst.so.4 libpst.so.4.1.8 # 0.6.60 libpst.so.4 libpst.so.4.1.9 # 0.6.61 libpst.so.4 libpst.so.4.1.9 # 0.6.62 libpst.so.4 libpst.so.4.1.9 # 0.6.63 libpst.so.4 libpst.so.4.1.10 # 0.6.66 libpst.so.4 libpst.so.4.1.11 # 0.6.67 libpst.so.4 libpst.so.4.1.12 # 0.6.68 libpst.so.4 libpst.so.4.1.13 # 0.6.69 libpst.so.4 libpst.so.4.1.14 # Check for solaris AC_MSG_CHECKING([for Solaris]) case "$host" in *solaris*) os_solaris=yes ;; *) os_solaris=no ;; esac AC_MSG_RESULT($os_solaris) AM_CONDITIONAL(OS_SOLARIS, [test "$os_solaris" = "yes"]) # Check for win32 AC_MSG_CHECKING([for Win32]) case "$host" in *-mingw*) os_win32=yes ;; *) os_win32=no ;; esac AC_MSG_RESULT($os_win32) AM_CONDITIONAL(OS_WIN32, [test "$os_win32" = "yes"]) # Check for Win32 platform AC_MSG_CHECKING([for Win32 platform in general]) case "$host" in *-cygwin*) platform_win32=yes ;; *) platform_win32=$os_win32 ;; esac AC_MSG_RESULT($platform_win32) AM_CONDITIONAL(PLATFORM_WIN32, [test "$platform_win32" = "yes"]) # Checks for programs. 
# The following lines adds the --enable-dii option to configure: # # Give the user the choice to enter one of these: # --enable-dii # --enable-dii=yes # --enable-dii=no # AC_MSG_CHECKING([whether we are enabling dii utility]) AC_ARG_ENABLE(dii, AC_HELP_STRING([--enable-dii], [enable dii utility]), [ case "${enableval}" in yes) ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --enable-dii) ;; esac ], # default if not specified enable_dii=yes ) AC_MSG_RESULT([$enable_dii]) AC_PATH_PROG(CONVERT, convert) if test "x$CONVERT" = "x" ; then if test "$enable_dii" = "yes"; then enable_dii=no AC_MSG_WARN([convert program not found. pst2dii disabled]) fi else if test "x`$CONVERT --version 2>&1 | grep -i imagemagick >/dev/null ; echo $?`" != "x0"; then if test "$enable_dii" = "yes"; then enable_dii=no AC_MSG_WARN([wrong convert program found. pst2dii disabled]) fi fi fi AC_CHECK_HEADER([gd.h], [ AC_DEFINE([HAVE_GD_H], [1], [Define to 1 if you have the <gd.h> header file.]) ], [ if test "$enable_dii" = "yes"; then enable_dii=no AC_MSG_WARN([gd.h not found. pst2dii disabled]) fi ]) AM_CONDITIONAL(BUILD_DII, [test "$enable_dii" = "yes"]) # Checks for programs. AC_PROG_CXX AC_PROG_CC AM_PROG_CC_C_O AC_PROG_CPP AC_PROG_INSTALL AC_PROG_LN_S AC_PROG_LIBTOOL AC_PROG_MAKE_SET # make sure we get large file support AC_SYS_LARGEFILE AC_CHECK_SIZEOF(off_t) # Checks for header files. AC_CHECK_HEADER([unistd.h], AM_CONDITIONAL(NEED_XGETOPT, [test yes = no]), AM_CONDITIONAL(NEED_XGETOPT, [test yes = yes]) ) AC_HEADER_DIRENT AC_HEADER_STDC AC_CHECK_HEADERS([ctype.h dirent.h errno.h fcntl.h inttypes.h limits.h regex.h semaphore.h signal.h stdarg.h stdint.h stdio.h stdlib.h string.h sys/param.h sys/shm.h sys/stat.h sys/types.h time.h unistd.h wchar.h]) AC_SEARCH_LIBS([sem_init],[pthread rt]) # Checks for typedefs, structures, and compiler characteristics. 
AC_HEADER_STDBOOL AC_HEADER_SYS_WAIT AC_C_CONST AC_C_INLINE AC_TYPE_OFF_T AC_TYPE_SIZE_T AC_TYPE_PID_T AC_STRUCT_TM # Checks for library functions. AC_FUNC_FORK AC_FUNC_FSEEKO AC_FUNC_STAT AC_FUNC_LSTAT AC_FUNC_LSTAT_FOLLOWS_SLASHED_SYMLINK if test "$cross_compiling" != "yes"; then AC_FUNC_MALLOC AC_FUNC_MKTIME AC_FUNC_REALLOC fi AC_FUNC_STRFTIME AC_FUNC_VPRINTF AC_CHECK_FUNCS([chdir getcwd memchr memmove memset regcomp strcasecmp strncasecmp strchr strdup strerror strpbrk strrchr strstr strtol get_current_dir_name]) AM_ICONV if test "$am_cv_func_iconv" != "yes"; then AC_MSG_ERROR([libpst requires iconv which is missing]) fi AC_CHECK_FUNCS(regexec,,[AC_CHECK_LIB(regex,regexec, [REGEXLIB=-lregex AC_DEFINE(HAVE_REGEXEC,1,[Define to 1 if you have the regexec function.])], [AC_MSG_ERROR([No regex library found])])]) AC_SUBST(REGEXLIB) # The following lines adds the --enable-pst-debug option to configure: # # Give the user the choice to enter one of these: # --enable-pst-debug # --enable-pst-debug=yes # --enable-pst-debug=no # AC_MSG_CHECKING([whether we are forcing debug dump file creation]) AC_ARG_ENABLE(pst-debug, AC_HELP_STRING([--enable-pst-debug], [force debug dump file creation]), [ case "${enableval}" in yes) ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --enable-pst-debug) ;; esac ], # default if not specified enable_pst_debug=no ) AC_MSG_RESULT([$enable_pst_debug]) if test "$enable_pst_debug" = "yes"; then AC_DEFINE(DEBUG_ALL, 1, Define to 1 to force debug dump file creation) fi # The following lines adds the --enable-libpst-shared option to configure: # # Give the user the choice to enter one of these: # --enable-libpst-shared # --enable-libpst-shared=yes # --enable-libpst-shared=no # AC_MSG_CHECKING([whether we are building libpst shared object]) AC_ARG_ENABLE(libpst-shared, AC_HELP_STRING([--enable-libpst-shared], [build libpst shared object]), [ case "${enableval}" in yes) ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --enable-libpst-shared) 
;; esac ], # default if not specified enable_libpst_shared=no ) AC_MSG_RESULT([$enable_libpst_shared]) enable_static_tools=yes if test "$enable_libpst_shared" = "yes"; then enable_shared=yes enable_static_tools=no fi # needed by STATIC_TOOLS in src/Makefile.am AC_SUBST(PST_OBJDIR, [$objdir]) # The following lines adds the --enable-static-tools option to configure: # # Give the user the choice to enter one of these: # --enable-static-tools # --enable-static-tools=yes # --enable-static-tools=no # AC_MSG_CHECKING([whether to link command line tools with libpst statically]) AC_ARG_ENABLE([static-tools], AC_HELP_STRING([--enable-static-tools], [link command line tools with libpst statically]), [ case "${enableval}" in yes) ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --enable-static-tools) ;; esac ], [ enable_static_tools=no ]) AC_MSG_RESULT([$enable_static_tools]) AM_CONDITIONAL(STATIC_TOOLS, [test "$enable_static_tools" = "yes"]) if test "$enable_static_tools" = "yes"; then enable_static="yes" fi # The following lines adds the --enable-python option to configure: # # Give the user the choice to enter one of these: # --enable-python # --enable-python=yes # --enable-python=no # AC_MSG_CHECKING([whether to build the libpst python interface]) AC_ARG_ENABLE([python], AC_HELP_STRING([--enable-python], [build libpst python interface]), [ case "${enableval}" in yes) ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --python) ;; esac ], [ enable_python=yes ]) AC_MSG_RESULT([$enable_python]) AM_CONDITIONAL(PYTHON_INTERFACE, [test "$enable_python" = "yes"]) if test "$enable_python" = "yes"; then enable_shared="yes" # get the version of installed python AX_PYTHON if test "$ax_python_bin" = "no"; then AC_MSG_ERROR(python binary not found) fi py_ver=`echo $ax_python_bin | cut -c7-` # find the flags for that version AC_PYTHON_DEVEL([$py_ver]) PYTHON_INCLUDE_DIR=`echo $python_path | cut -c3-` AC_SUBST([PYTHON_INCLUDE_DIR]) # do we have boost python AX_BOOST_PYTHON if test 
"$ac_cv_boost_python" = "no"; then AC_MSG_ERROR(boost python not found) fi AC_SUBST(PYTHON_VERSION, [$ax_python_bin]) fi # The following lines adds the --enable-profiling option to configure: # # Give the user the choice to enter one of these: # --enable-profiling # --enable-profiling=yes # --enable-profiling=no # AC_MSG_CHECKING([whether to link with gprof profiling]) AC_ARG_ENABLE([profiling], AC_HELP_STRING([--enable-profiling], [link with gprof profiling]), [ case "${enableval}" in yes) CFLAGS="$CFLAGS -pg" CPPFLAGS="$CPPFLAGS -pg" CXXFLAGS="$CXXFLAGS -pg" ;; no) ;; *) AC_MSG_ERROR(bad value ${enableval} for --profiling) ;; esac ], [ enable_profiling=no ]) AC_MSG_RESULT([$enable_profiling]) AM_CONDITIONAL(GPROF_PROFILING, [test "$enable_profiling" = "yes"]) gsf_flags="`pkg-config libgsf-1 --cflags`" gsf_libs="`pkg-config libgsf-1 --libs`" if test "$gsf_flags" = ""; then AC_MSG_ERROR(libgsf not found) fi AC_SUBST(GSF_FLAGS, [$gsf_flags]) AC_SUBST(GSF_LIBS, [$gsf_libs]) PKG_CHECK_MODULES([ZLIB], [zlib]) AC_OUTPUT( \ Makefile \ html/Makefile \ libpst.pc \ libpst.spec \ man/Makefile \ src/Makefile \ src/pst2dii.cpp \ python/Makefile \ xml/Makefile \ xml/libpst \ ) diff --git a/src/define.h b/src/define.h index 44f2c6f..df6fcc3 100644 --- a/src/define.h +++ b/src/define.h @@ -1,260 +1,261 @@ /*** * define.h * Part of the LibPST project * Written by David Smith * dave.s@earthcorp.com */ #ifndef DEFINEH_H #define DEFINEH_H #ifdef HAVE_CONFIG_H #include "config.h" #endif #include "libpst.h" #include "timeconv.h" #include "libstrfunc.h" #include "vbuf.h" #ifdef HAVE_STRING_H #include <string.h> #endif #ifdef HAVE_CTYPE_H #include <ctype.h> #endif #ifdef HAVE_LIMITS_H #include <limits.h> #endif #ifdef HAVE_WCHAR_H #include <wchar.h> #endif #ifdef HAVE_SIGNAL_H #include <signal.h> #endif #ifdef HAVE_ERRNO_H #include <errno.h> #endif #ifdef HAVE_ICONV #include <iconv.h> #endif #ifdef HAVE_REGEX_H #include <regex.h> #endif #ifdef HAVE_GD_H #include <gd.h> #endif #define 
PERM_DIRS 0777 #ifdef _WIN32 #include <direct.h> #define D_MKDIR(x) mkdir(x) #define chdir _chdir #define strcasecmp _stricmp #define vsnprintf _vsnprintf #define snprintf _snprintf #ifdef _MSC_VER #define ftello _ftelli64 #define fseeko _fseeki64 #elif defined (__MINGW32__) #define ftello ftello64 #define fseeko fseeko64 #else #error Only MSC and mingw supported for Windows #endif #ifndef UINT64_MAX #define UINT64_MAX ((uint64_t)0xffffffffffffffff) #endif #ifndef PRIx64 #define PRIx64 "I64x" #endif int __cdecl _fseeki64(FILE *, __int64, int); __int64 __cdecl _ftelli64(FILE *); #ifdef __MINGW32__ #include <getopt.h> #else #include "XGetopt.h" #endif #include <process.h> #undef gmtime_r #define gmtime_r(tp,tmp) (gmtime(tp)?(*(tmp)=*gmtime(tp),(tmp)):0) #define ctime_r(tp,tmp) (ctime(tp)?(strcpy((tmp),ctime((tp))),(tmp)):0) #else #ifdef __DJGPP__ #define gmtime_r(tp,tmp) (gmtime(tp)?(*(tmp)=*gmtime(tp),(tmp)):0) #define ctime_r(tp,tmp) (ctime(tp)?(strcpy((tmp),ctime((tp))),(tmp)):0) #define fseeko(stream, offset, whence) fseek(stream, (long)offset, whence) #define ftello ftell #endif #ifdef HAVE_UNISTD_H #include <unistd.h> #else #include "XGetopt.h" #endif #define D_MKDIR(x) mkdir(x, PERM_DIRS) #endif #ifdef HAVE_SYS_STAT_H #include <sys/stat.h> #endif #ifdef HAVE_SYS_TYPES_H #include <sys/types.h> #endif #ifdef HAVE_SYS_SHM_H #include <sys/shm.h> #endif #ifdef HAVE_SYS_WAIT_H #include <sys/wait.h> #endif #ifdef HAVE_DIRENT_H #include <dirent.h> #endif #ifdef HAVE_SEMAPHORE_H #include <semaphore.h> #endif void pst_debug_lock(); void pst_debug_unlock(); void pst_debug_setlevel(int level); void pst_debug_init(const char* fname, void* output_mutex); void pst_debug_func(int level, const char* function); void pst_debug_func_ret(int level); void pst_debug(int level, int line, const char *file, const char *fmt, ...); void pst_debug_hexdump(int level, int line, const char *file, const char* buf, size_t size, int cols, int delta); void pst_debug_hexdumper(FILE* out, const 
char* buf, size_t size, int cols, int delta); void pst_debug_close(); void* pst_malloc(size_t size); void *pst_realloc(void *ptr, size_t size); #define MESSAGEPRINT1(...) pst_debug(1, __LINE__, __FILE__, __VA_ARGS__) #define MESSAGEPRINT2(...) pst_debug(2, __LINE__, __FILE__, __VA_ARGS__) #define MESSAGEPRINT3(...) pst_debug(3, __LINE__, __FILE__, __VA_ARGS__) #define WARN(x) { \ MESSAGEPRINT3 x; \ pst_debug_lock(); \ printf x; \ fflush(stdout); \ pst_debug_unlock(); \ } #define DIE(x) { \ WARN(x); \ exit(EXIT_FAILURE); \ } #define DEBUG_WARN(x) MESSAGEPRINT3 x #define DEBUG_INFO(x) MESSAGEPRINT2 x #define DEBUG_HEXDUMP(x, s) pst_debug_hexdump(1, __LINE__, __FILE__, (char*)x, s, 0x10, 0) #define DEBUG_HEXDUMPC(x, s, c) pst_debug_hexdump(1, __LINE__, __FILE__, (char*)x, s, c, 0) #define DEBUG_ENT(x) \ { \ pst_debug_func(1, x); \ pst_debug(1, __LINE__, __FILE__, "Entering function\n"); \ } #define DEBUG_RET() \ { \ pst_debug(1, __LINE__, __FILE__, "Leaving function\n"); \ pst_debug_func_ret(1); \ } #define DEBUG_INIT(fname,mutex) {pst_debug_init(fname,mutex);} #define DEBUG_CLOSE() {pst_debug_close();} #define RET_DERROR(res, ret_val, x) if (res) { DIE(x);} #if BYTE_ORDER == BIG_ENDIAN # define LE64_CPU(x) \ x = ((((x) & UINT64_C(0xff00000000000000)) >> 56) | \ (((x) & UINT64_C(0x00ff000000000000)) >> 40) | \ (((x) & UINT64_C(0x0000ff0000000000)) >> 24) | \ (((x) & UINT64_C(0x000000ff00000000)) >> 8 ) | \ (((x) & UINT64_C(0x00000000ff000000)) << 8 ) | \ (((x) & UINT64_C(0x0000000000ff0000)) << 24) | \ (((x) & UINT64_C(0x000000000000ff00)) << 40) | \ (((x) & UINT64_C(0x00000000000000ff)) << 56)); # define LE32_CPU(x) \ x = ((((x) & 0xff000000) >> 24) | \ (((x) & 0x00ff0000) >> 8 ) | \ (((x) & 0x0000ff00) << 8 ) | \ (((x) & 0x000000ff) << 24)); # define LE16_CPU(x) \ x = ((((x) & 0xff00) >> 8) | \ (((x) & 0x00ff) << 8)); #elif BYTE_ORDER == LITTLE_ENDIAN # define LE64_CPU(x) {} # define LE32_CPU(x) {} # define LE16_CPU(x) {} #else # error "Byte order not supported by 
this library" #endif // BYTE_ORDER #define PST_LE_GET_UINT64(p) \ (uint64_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8) | \ (((uint8_t const *)(p))[2] << 16) | \ (((uint8_t const *)(p))[3] << 24) | \ (((uint8_t const *)(p))[4] << 32) | \ (((uint8_t const *)(p))[5] << 40) | \ (((uint8_t const *)(p))[6] << 48) | \ (((uint8_t const *)(p))[7] << 56)) #define PST_LE_GET_INT64(p) \ (int64_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8) | \ (((uint8_t const *)(p))[2] << 16) | \ (((uint8_t const *)(p))[3] << 24) | \ (((uint8_t const *)(p))[4] << 32) | \ (((uint8_t const *)(p))[5] << 40) | \ (((uint8_t const *)(p))[6] << 48) | \ (((uint8_t const *)(p))[7] << 56)) #define PST_LE_GET_UINT32(p) \ (uint32_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8) | \ (((uint8_t const *)(p))[2] << 16) | \ (((uint8_t const *)(p))[3] << 24)) #define PST_LE_GET_INT32(p) \ (int32_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8) | \ (((uint8_t const *)(p))[2] << 16) | \ (((uint8_t const *)(p))[3] << 24)) #define PST_LE_GET_UINT16(p) \ (uint16_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8)) #define PST_LE_GET_INT16(p) \ (int16_t)((((uint8_t const *)(p))[0] << 0) | \ (((uint8_t const *)(p))[1] << 8)) #define PST_LE_GET_UINT8(p) (*(uint8_t const *)(p)) #define PST_LE_GET_INT8(p) (*(int8_t const *)(p)) +#define MAXDATEFMTLEN 40 #endif //DEFINEH_H diff --git a/src/lspst.c b/src/lspst.c index e4652bf..f2886de 100644 --- a/src/lspst.c +++ b/src/lspst.c @@ -1,267 +1,312 @@ /*** * lspst.c * Part of the LibPST project * Author: Joe Nahmias <joe@nahmias.net> * Based on readpst.c by by David Smith <dave.s@earthcorp.com> * */ #include "define.h" struct file_ll { char *dname; int32_t stored_count; int32_t item_count; int32_t skip_count; int32_t type; }; +struct options { + int long_format; + char *date_format; +}; void canonicalize_filename(char *fname); void debug_print(char *fmt, 
...); void usage(char *prog_name); void version(); // global settings pst_file pstfile; void create_enter_dir(struct file_ll* f, pst_item *item) { pst_convert_utf8(item, &item->file_as); f->item_count = 0; f->skip_count = 0; f->type = item->type; f->stored_count = (item->folder) ? item->folder->item_count : 0; f->dname = strdup(item->file_as.str); } void close_enter_dir(struct file_ll *f) { free(f->dname); } - -void process(pst_item *outeritem, pst_desc_tree *d_ptr) +void process(pst_item *outeritem, pst_desc_tree *d_ptr, struct options o) { struct file_ll ff; pst_item *item = NULL; char *result = NULL; size_t resultlen = 0; + size_t dateresultlen; DEBUG_ENT("process"); memset(&ff, 0, sizeof(ff)); create_enter_dir(&ff, outeritem); while (d_ptr) { if (!d_ptr->desc) { DEBUG_WARN(("ERROR item's desc record is NULL\n")); ff.skip_count++; } else { DEBUG_INFO(("Desc Email ID %"PRIx64" [d_ptr->d_id = %"PRIx64"]\n", d_ptr->desc->i_id, d_ptr->d_id)); item = pst_parse_item(&pstfile, d_ptr, NULL); DEBUG_INFO(("About to process item @ %p.\n", item)); if (item) { if (item->message_store) { // there should only be one message_store, and we have already done it DIE(("A second message_store has been found. Sorry, this must be an error.\n")); } if (item->folder && d_ptr->child) { // if this is a folder, we want to recurse into it pst_convert_utf8(item, &item->file_as); printf("Folder \"%s\"\n", item->file_as.str); - process(item, d_ptr->child); + process(item, d_ptr->child, o); } else if (item->contact && (item->type == PST_TYPE_CONTACT)) { if (!ff.type) ff.type = item->type; // Process Contact item if (ff.type != PST_TYPE_CONTACT) { DEBUG_INFO(("I have a contact, but the folder isn't a contacts folder. 
Processing anyway\n")); } printf("Contact"); if (item->contact->fullname.str) printf("\t%s", pst_rfc2426_escape(item->contact->fullname.str, &result, &resultlen)); printf("\n"); } else if (item->email && ((item->type == PST_TYPE_NOTE) || (item->type == PST_TYPE_SCHEDULE) || (item->type == PST_TYPE_REPORT))) { if (!ff.type) ff.type = item->type; // Process Email item if ((ff.type != PST_TYPE_NOTE) && (ff.type != PST_TYPE_SCHEDULE) && (ff.type != PST_TYPE_REPORT)) { DEBUG_INFO(("I have an email, but the folder isn't an email folder. Processing anyway\n")); } printf("Email"); + if (o.long_format == 1) { + if (item->email->arrival_date) { + char time_buffer[MAXDATEFMTLEN]; + dateresultlen = pst_fileTimeToString(item->email->arrival_date, o.date_format, time_buffer); + if (dateresultlen < 1) + DIE(("Date format error in -f option.\n")); + printf("\tDate: %s", time_buffer); + } + else + printf("\t"); + } if (item->email->outlook_sender_name.str) printf("\tFrom: %s", item->email->outlook_sender_name.str); + else + printf("\t"); + if (o.long_format == 1) { + if (item->email->outlook_recipient_name.str) + printf("\tTo: %s", item->email->outlook_recipient_name.str); + else + printf("\t"); + if (item->email->cc_address.str) + printf("\tCC: %s", item->email->cc_address.str); + else + printf("\t"); + if (item->email->bcc_address.str) + printf("\tBCC: %s", item->email->bcc_address.str); + else + printf("\t"); + } if (item->subject.str) printf("\tSubject: %s", item->subject.str); + else + printf("\t"); printf("\n"); } else if (item->journal && (item->type == PST_TYPE_JOURNAL)) { if (!ff.type) ff.type = item->type; // Process Journal item if (ff.type != PST_TYPE_JOURNAL) { DEBUG_INFO(("I have a journal entry, but folder isn't specified as a journal type. 
Processing...\n")); } if (item->subject.str) printf("Journal\t%s\n", pst_rfc2426_escape(item->subject.str, &result, &resultlen)); } else if (item->appointment && (item->type == PST_TYPE_APPOINTMENT)) { char time_buffer[30]; if (!ff.type) ff.type = item->type; // Process Calendar Appointment item DEBUG_INFO(("Processing Appointment Entry\n")); if (ff.type != PST_TYPE_APPOINTMENT) { DEBUG_INFO(("I have an appointment, but folder isn't specified as an appointment type. Processing...\n")); } printf("Appointment"); if (item->subject.str) printf("\tSUMMARY: %s", pst_rfc2426_escape(item->subject.str, &result, &resultlen)); if (item->appointment->start) printf("\tSTART: %s", pst_rfc2445_datetime_format(item->appointment->start, sizeof(time_buffer), time_buffer)); if (item->appointment->end) printf("\tEND: %s", pst_rfc2445_datetime_format(item->appointment->end, sizeof(time_buffer), time_buffer)); printf("\tALL DAY: %s", (item->appointment->all_day==1 ? "Yes" : "No")); printf("\n"); } else { ff.skip_count++; DEBUG_INFO(("Unknown item type. %i. Ascii1=\"%s\"\n", item->type, item->ascii_type)); } pst_freeItem(item); } else { ff.skip_count++; DEBUG_INFO(("A NULL item was seen\n")); } } d_ptr = d_ptr->next; } close_enter_dir(&ff); if (result) free(result); DEBUG_RET(); } void usage(char *prog_name) { DEBUG_ENT("usage"); version(); printf("Usage: %s [OPTIONS] {PST FILENAME}\n", prog_name); printf("OPTIONS:\n"); printf("\t-d <filename> \t- Debug to file. This is a binary log. Use readlog to print it\n"); + printf("\t-l\t- Print the date, CC and BCC fields of emails too (by default only the From and Subject)\n"); + printf("\t-f <date_format> \t- Select the date format in ctime format (by default \"%%F %%T\")\n"); printf("\t-h\t- Help. This screen\n"); printf("\t-V\t- Version. 
Display program version\n"); DEBUG_RET(); } void version() { DEBUG_ENT("version"); printf("lspst / LibPST v%s\n", VERSION); #if BYTE_ORDER == BIG_ENDIAN printf("Big Endian implementation being used.\n"); #elif BYTE_ORDER == LITTLE_ENDIAN printf("Little Endian implementation being used.\n"); #else # error "Byte order not supported by this library" #endif DEBUG_RET(); } int main(int argc, char* const* argv) { pst_item *item = NULL; pst_desc_tree *d_ptr; char *temp = NULL; //temporary char pointer int c; char *d_log = NULL; + struct options o; + o.long_format = 0; + char *defaultfmtdate = "%F %T"; + o.date_format = defaultfmtdate; - while ((c = getopt(argc, argv, "d:hV"))!= -1) { + while ((c = getopt(argc, argv, "d:f:lhV"))!= -1) { switch (c) { case 'd': d_log = optarg; break; + case 'f': + o.date_format = optarg; + break; + case 'l': + o.long_format = 1; + break; case 'h': usage(argv[0]); exit(0); break; case 'V': version(); exit(0); break; default: usage(argv[0]); exit(1); break; } } #ifdef DEBUG_ALL // force a log file if (!d_log) d_log = "lspst.log"; #endif // defined DEBUG_ALL DEBUG_INIT(d_log, NULL); DEBUG_ENT("main"); if (argc <= optind) { usage(argv[0]); exit(2); } // Open PST file if (pst_open(&pstfile, argv[optind], NULL)) DIE(("Error opening File\n")); // Load PST index if (pst_load_index(&pstfile)) DIE(("Index Error\n")); pst_load_extended_attributes(&pstfile); d_ptr = pstfile.d_head; // first record is main record item = pst_parse_item(&pstfile, d_ptr, NULL); if (!item || !item->message_store) { DEBUG_RET(); DIE(("Could not get root record\n")); } // default the file_as to the same as the main filename if it doesn't exist if (!item->file_as.str) { if (!(temp = strrchr(argv[1], '/'))) if (!(temp = strrchr(argv[1], '\\'))) temp = argv[1]; else temp++; // get past the "\\" else temp++; // get past the "/" item->file_as.str = strdup(temp); item->file_as.is_utf8 = 1; } d_ptr = pst_getTopOfFolders(&pstfile, item); if (!d_ptr) DIE(("Top of folders record not 
found. Cannot continue\n")); - process(item, d_ptr->child); // do the childred of TOPF + process(item, d_ptr->child, o); // do the childred of TOPF pst_freeItem(item); pst_close(&pstfile); DEBUG_RET(); return 0; } // This function will make sure that a filename is in cannonical form. That // is, it will replace any slashes, backslashes, or colons with underscores. void canonicalize_filename(char *fname) { DEBUG_ENT("canonicalize_filename"); if (fname == NULL) { DEBUG_RET(); return; } while ((fname = strpbrk(fname, "/\\:"))) *fname = '_'; DEBUG_RET(); } diff --git a/src/timeconv.c b/src/timeconv.c index 2e34045..69c1b48 100644 --- a/src/timeconv.c +++ b/src/timeconv.c @@ -1,29 +1,34 @@ #include "define.h" char* pst_fileTimeToAscii(const FILETIME* filetime, char* result) { time_t t; t = pst_fileTimeToUnixTime(filetime); return ctime_r(&t, result); } +size_t pst_fileTimeToString(const FILETIME* filetime, const char* date_format, char* result) { + time_t t; + t = pst_fileTimeToUnixTime(filetime); + return strftime(result, MAXDATEFMTLEN-1, date_format, localtime(&t)); +} void pst_fileTimeToStructTM (const FILETIME *filetime, struct tm *result) { time_t t1; t1 = pst_fileTimeToUnixTime(filetime); gmtime_r(&t1, result); } time_t pst_fileTimeToUnixTime(const FILETIME *filetime) { uint64_t t = filetime->dwHighDateTime; const uint64_t bias = 11644473600LL; t <<= 32; t += filetime->dwLowDateTime; t /= 10000000; t -= bias; return ((t > (uint64_t)0x000000007fffffff) && (sizeof(time_t) <= 4)) ? 0 : (time_t)t; } diff --git a/src/timeconv.h b/src/timeconv.h index 2361e56..1b4d577 100644 --- a/src/timeconv.h +++ b/src/timeconv.h @@ -1,31 +1,39 @@ #ifndef __PST_TIMECONV_H #define __PST_TIMECONV_H #include "common.h" #ifdef __cplusplus extern "C" { #endif /** Convert a FILETIME to ascii printable local time. @param[in] filetime time structure to be converted @param[out] result pointer to output buffer, must be at least 30 bytes. 
@return result pointer to the output buffer */ char* pst_fileTimeToAscii (const FILETIME* filetime, char* result); /** Convert a FILETIME to unix struct tm. @param[in] filetime time structure to be converted @param[out] result pointer to output struct tm */ void pst_fileTimeToStructTM (const FILETIME* filetime, struct tm *result); /** Convert a FILETIME to unix time_t value. @param[in] filetime time structure to be converted @return result time_t value */ time_t pst_fileTimeToUnixTime( const FILETIME* filetime); + + /** Convert a FILETIME to a local time string in date_format format. + @param[in] filetime time structure to be converted + @param[in] date_format strftime format of output date + @param[out] result pointer to output buffer, must be at least MAXDATEFMTLEN bytes. + @return result size_t value returned by strftime (0 on failure) + */ + size_t pst_fileTimeToString( const FILETIME* filetime, const char* date_format, char* result); #ifdef __cplusplus } #endif #endif diff --git a/xml/libpst.in b/xml/libpst.in index 25b374d..c5787ee 100644 --- a/xml/libpst.in +++ b/xml/libpst.in @@ -1,2041 +1,2055 @@ <reference> <title>@PACKAGE@ Utilities - Version @VERSION@</title> <partintro> <title>Packages</title> <para>The various source and binary packages are available at <ulink url="http://www.five-ten-sg.com/@PACKAGE@/packages/">http://www.five-ten-sg.com/@PACKAGE@/packages/</ulink>. The most recent documentation is available at <ulink url="http://www.five-ten-sg.com/@PACKAGE@/">http://www.five-ten-sg.com/@PACKAGE@/</ulink>. The most recent developer documentation for the shared library is available at <ulink url="http://www.five-ten-sg.com/@PACKAGE@/devel/">http://www.five-ten-sg.com/@PACKAGE@/devel/</ulink>. </para> <para>A <ulink url="http://www.selenic.com/mercurial/wiki/">Mercurial</ulink> source code repository for this project is available at <ulink url="http://hg.five-ten-sg.com/@PACKAGE@/">http://hg.five-ten-sg.com/@PACKAGE@/</ulink>.
</para> <para>This version can now convert both 32 bit Outlook files (pre 2003), and the 64 bit Outlook 2003 pst files. Utilities are supplied to convert email messages to both mbox and MH mailbox formats, and to DII load file format for use with many of the <ulink url="http://www.ctsummation.com">CT Summation</ulink> products. Contacts can be converted to a simple list, to vcard format, or to ldif format for import to an LDAP server. </para> <para>The <ulink url="http://code.google.com/p/libpff/">libpff</ulink> project has some excellent documentation of the pst file format. </para> </partintro> <refentry id="readpst.1"> <refentryinfo> - <date>2016-08-29</date> + <date>2017-12-07</date> </refentryinfo> <refmeta> <refentrytitle>readpst</refentrytitle> <manvolnum>1</manvolnum> <refmiscinfo>readpst @VERSION@</refmiscinfo> </refmeta> <refnamediv id='readpst.name.1'> <refname>readpst</refname> <refpurpose>convert PST (MS Outlook Personal Folders) files to mbox and other formats</refpurpose> </refnamediv> <refsynopsisdiv id='readpst.synopsis.1'> <title>Synopsis</title> <cmdsynopsis> <command>readpst</command> <arg><option>-C <replaceable class="parameter">default-charset</replaceable></option></arg> <arg><option>-D</option></arg> <arg><option>-M</option></arg> <arg><option>-S</option></arg> <arg><option>-V</option></arg> <arg><option>-a <replaceable class="parameter">attachment-extension-list</replaceable></option></arg> <arg><option>-b</option></arg> <arg><option>-c <replaceable class="parameter">format</replaceable></option></arg> <arg><option>-d <replaceable class="parameter">debug-file</replaceable></option></arg> <arg><option>-e</option></arg> <arg><option>-h</option></arg> <arg><option>-j <replaceable class="parameter">jobs</replaceable></option></arg> <arg><option>-k</option></arg> <arg><option>-m</option></arg> <arg><option>-o <replaceable class="parameter">output-directory</replaceable></option></arg> <arg><option>-q</option></arg> 
<arg><option>-r</option></arg> <arg><option>-t <replaceable class="parameter">output-type-codes</replaceable></option></arg> <arg><option>-u</option></arg> <arg><option>-w</option></arg> <arg><option>-8</option></arg> <arg choice='plain'>pstfile</arg> </cmdsynopsis> </refsynopsisdiv> <refsect1 id='readpst.description.1'> <title>Description</title> <para><command>readpst</command> is a program that can read an Outlook PST (Personal Folders) file and convert it into an mbox file, a format suitable for KMail, a recursive mbox structure, or separate emails. </para> </refsect1> <refsect1 id='readpst.options.1'> <title>Options</title> <variablelist> <varlistentry> <term>-C <replaceable class="parameter">default-charset</replaceable></term> <listitem><para> Set the character set to be used for items with an unspecified character set. </para></listitem> </varlistentry> <varlistentry> <term>-D</term> <listitem><para> Include deleted items in the output. </para></listitem> </varlistentry> <varlistentry> <term>-M</term> <listitem><para> Output messages in MH (rfc822) format as separate files. This will create folders as named in the PST file, and will put each email together with any attachments into its own file. These files will be numbered from 1 to n with no leading zeros. This format has no from quoting. </para></listitem> </varlistentry> <varlistentry> <term>-S</term> <listitem><para> Output messages into separate files. This will create folders as named in the PST file, and will put each email in its own file. These files will be numbered from 1 to n with no leading zeros. Attachments will also be saved in the same folder as the email message. The attachments for message $m are saved as $m-$name where $name is (the original name of the attachment, or 'attach$n' if the attachment had no name), where $n is another sequential index with no leading zeros. This format has no from quoting. 
</para></listitem> </varlistentry> <varlistentry> <term>-V</term> <listitem><para> Show program version and exit. </para></listitem> </varlistentry> <varlistentry> <term>-a <replaceable class="parameter">attachment-extension-list</replaceable></term> <listitem><para> Set the list of acceptable attachment extensions. Any attachment that does not have an extension on this list will be discarded. All attachments are acceptable if the list is empty, or this option is not specified. </para></listitem> </varlistentry> <varlistentry> <term>-b</term> <listitem><para> Do not save the attachments for the RTF format of the email body. </para></listitem> </varlistentry> <varlistentry> <term>-c <replaceable class="parameter">format</replaceable></term> <listitem><para> Set the Contact output mode. Use -cv for vcard format or -cl for an email list. </para></listitem> </varlistentry> <varlistentry> <term>-d <replaceable class="parameter">debug-file</replaceable></term> <listitem><para> Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. </para></listitem> </varlistentry> <varlistentry> <term>-e</term> <listitem><para> Same as the M option, but each output file will include an extension from (.eml, .ics, .vcf). This format has no from quoting. </para></listitem> </varlistentry> <varlistentry> <term>-h</term> <listitem><para> Show summary of options and exit. </para></listitem> </varlistentry> <varlistentry> <term>-j <replaceable class="parameter">jobs</replaceable></term> <listitem><para> Specifies the maximum number of parallel jobs. Specify 0 to suppress running parallel jobs. Folders may be processed in parallel. Output formats that place each mail message in a separate file (-M, -S, -e) may process the contents of individual folders in parallel. </para></listitem> </varlistentry> <varlistentry> <term>-k</term> <listitem><para> Changes the output format to KMail. This format uses mboxrd from quoting. 
</para></listitem> </varlistentry> <varlistentry> <term>-m</term> <listitem><para> Same as the e option, but also write .msg files. </para></listitem> </varlistentry> <varlistentry> <term>-o <replaceable class="parameter">output-directory</replaceable></term> <listitem><para> Specifies the output directory. The directory must already exist, and is entered after the PST file is opened, but before any processing of files commences. </para></listitem> </varlistentry> <varlistentry> <term>-q</term> <listitem><para> Changes to silent mode. No feedback is printed to the screen, except for error messages. </para></listitem> </varlistentry> <varlistentry> <term>-r</term> <listitem><para> Changes the output format to Recursive. This will create folders as named in the PST file, and will put all emails in a file called "mbox" inside each folder. Appointments go into a file called "calendar", address book entries go into a file called "contacts", and journal entries go into a file called "journal". These files are then compatible with all mbox-compatible email clients. This format uses mboxrd from quoting. </para></listitem> </varlistentry> <varlistentry> <term>-t <replaceable class="parameter">output-type-codes</replaceable></term> <listitem><para> Specifies the item types that are processed. The argument is a sequence of single letters from (e,a,j,c) for (email, appointment, journal, contact) types. The default is to process all item types. </para></listitem> </varlistentry> <varlistentry> <term>-u</term> <listitem><para> Sets Thunderbird mode, a submode of recursive mode. This causes two extra .type and .size meta files to be created. This format uses mboxrd from quoting. </para></listitem> </varlistentry> <varlistentry> <term>-w</term> <listitem><para> Overwrite any previous output files. Beware: When used with the -S switch, this will remove all files from the target folder before writing. This is to keep the count of emails and attachments correct. 
</para></listitem> </varlistentry> <varlistentry> <term>-8</term> <listitem><para> Output bodies in UTF-8, rather than original encoding, if a UTF-8 version is available. </para></listitem> </varlistentry> </variablelist> </refsect1> <refsect1 id='readpst.quoting.1'> <title>From Quoting</title> <para> Output formats that place each mail message in a separate file (-M, -S, -e, -m) don't do any from quoting. Output formats that place multiple email messages in a single file (-k, -r, -u) now use mboxrd from quoting rules. If none of those switches are specified, the default output format uses mboxrd from quoting rules, since it produces multiple email messages in a single file. Earlier versions used mboxo from quoting rules for all output formats. </para> </refsect1> <refsect1 id='readpst.author.1'> <title>Author</title> <para> This manual page was originally written by Dave Smith <dave.s@earthcorp.com>, and updated by Joe Nahmias <joe@nahmias.net> for the Debian GNU/Linux system (but may be used by others). It was subsequently updated by Brad Hards <bradh@frogmouth.net>, and converted to xml format by Carl Byington <carl@five-ten-sg.com>. </para> </refsect1> <refsect1 id='readpst.copyright.1'> <title>Copyright</title> <para> Copyright (C) 2002 by David Smith <dave.s@earthcorp.com>. XML version Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com>. </para> <para> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. </para> <para> You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
</para> </refsect1> <refsect1 id='readpst.version.1'> <title>Version</title> <para> @VERSION@ </para> </refsect1> </refentry> <refentry id="lspst.1"> <refentryinfo> <date>2016-08-29</date> </refentryinfo> <refmeta> <refentrytitle>lspst</refentrytitle> <manvolnum>1</manvolnum> <refmiscinfo>lspst @VERSION@</refmiscinfo> </refmeta> <refnamediv id='lspst.name.1'> <refname>lspst</refname> <refpurpose>list PST (MS Outlook Personal Folders) file data</refpurpose> </refnamediv> <refsynopsisdiv id='lspst.synopsis.1'> <title>Synopsis</title> <cmdsynopsis> <command>lspst</command> <arg><option>-V</option></arg> <arg><option>-d <replaceable class="parameter">debug-file</replaceable></option></arg> + <arg><option>-f <replaceable class="parameter">date-format</replaceable></option></arg> + <arg><option>-l</option></arg> <arg><option>-h</option></arg> <arg choice='plain'>pstfile</arg> </cmdsynopsis> </refsynopsisdiv> <refsect1 id='lspst.options.1'> <title>Options</title> <variablelist> <varlistentry> <term>-V</term> <listitem><para> Show program version and exit. </para></listitem> </varlistentry> <varlistentry> <term>-d <replaceable class="parameter">debug-file</replaceable></term> <listitem><para> Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. </para></listitem> </varlistentry> + <varlistentry> + <term>-f <replaceable class="parameter">date-format</replaceable></term> + <listitem><para> + Select the date format for long format listing. Defaults to "%F %T". + </para></listitem> + </varlistentry> + <varlistentry> + <term>-l</term> + <listitem><para> + Use long format listing to show the Date, CC and BCC headers. + </para></listitem> + </varlistentry> <varlistentry> <term>-h</term> <listitem><para> Show summary of options and exit. 
</para></listitem> </varlistentry> </variablelist> </refsect1> <refsect1 id='lspst.description.1'> <title>Description</title> <para><command>lspst</command> is a program that can read an Outlook PST (Personal Folders) file and produce a simple listing of the data (contacts, email subjects, etc). </para> </refsect1> <refsect1 id='lspst.author.1'> <title>Author</title> <para> lspst was written by Joe Nahmias <joe@nahmias.net> based on readpst. This man page was written by 510 Software Group <carl@five-ten-sg.com>. </para> </refsect1> <refsect1 id='lspst.copyright.1'> <title>Copyright</title> <para> Copyright (C) 2004 by Joe Nahmias <joe@nahmias.net>. </para> <para> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. </para> <para> You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
</para> </refsect1> <refsect1 id='lspst.version.1'> <title>Version</title> <para> @VERSION@ </para> </refsect1> </refentry> <refentry id="pst2ldif.1"> <refentryinfo> - <date>2016-08-29</date> + <date>2017-12-07</date> </refentryinfo> <refmeta> <refentrytitle>pst2ldif</refentrytitle> <manvolnum>1</manvolnum> <refmiscinfo>pst2ldif @VERSION@</refmiscinfo> </refmeta> <refnamediv id='pst2ldif.name.1'> <refname>pst2ldif</refname> <refpurpose>extract contacts from a MS Outlook .pst file in .ldif format</refpurpose> </refnamediv> <refsynopsisdiv id='pst2ldif.synopsis.1'> <title>Synopsis</title> <cmdsynopsis> <command>pst2ldif</command> <arg><option>-V</option></arg> <arg><option>-b <replaceable class="parameter">ldap-base</replaceable></option></arg> <arg><option>-c <replaceable class="parameter">class</replaceable></option></arg> <arg><option>-d <replaceable class="parameter">debug-file</replaceable></option></arg> <arg><option>-l <replaceable class="parameter">extra-line</replaceable></option></arg> <arg><option>-o</option></arg> <arg><option>-h</option></arg> <arg choice='plain'>pstfilename</arg> </cmdsynopsis> </refsynopsisdiv> <refsect1 id='pst2ldif.options.1'> <title>Options</title> <variablelist> <varlistentry> <term>-V</term> <listitem><para> Show program version. Subsequent options are then ignored. </para></listitem> </varlistentry> <varlistentry> <term>-b <replaceable class="parameter">ldap-base</replaceable></term> <listitem><para> Sets the ldap base value used in the dn records. You probably want to use something like "o=organization, c=US". </para></listitem> </varlistentry> <varlistentry> <term>-c <replaceable class="parameter">class</replaceable></term> <listitem><para> Sets the objectClass values for the contact items. This class needs to be defined in the schema used by your LDAP server, and at a minimum it must contain the ldap attributes given below. This option may be specified multiple times to generate entries with multiple object classes. 
</para></listitem> </varlistentry> <varlistentry> <term>-d <replaceable class="parameter">debug-file</replaceable></term> <listitem><para> Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. </para></listitem> </varlistentry> <varlistentry> <term>-l <replaceable class="parameter">extra-line</replaceable></term> <listitem><para> Specify an extra line to be added to each ldap entry. This option may be specified multiple times to add multiple lines to each ldap entry. </para></listitem> </varlistentry> <varlistentry> <term>-o</term> <listitem><para> Use the old ldap schema, rather than the default new ldap schema. The old schema generates multiple postalAddress attributes for a single entry. The new schema generates a single postalAddress (and homePostalAddress when available) attribute with $ delimiters as specified in RFC4517. Using the old schema also generates two extra leading entries, one for "dn:ldap base", and one for "dn: cn=root, ldap base". </para></listitem> </varlistentry> <varlistentry> <term>-h</term> <listitem><para> Show summary of options. Subsequent options are then ignored. </para></listitem> </varlistentry> </variablelist> </refsect1> <refsect1 id='pst2ldif.description.1'> <title>Description</title> <para><command>pst2ldif</command> reads the contact information from a MS Outlook .pst file and produces a .ldif file that may be used to import those contacts into an LDAP database. 
The following ldap attributes are generated for the old ldap schema: <simplelist> <member>cn </member> <member>givenName </member> <member>sn </member> <member>personalTitle </member> <member>company </member> <member>mail </member> <member>postalAddress </member> <member>l </member> <member>st </member> <member>postalCode </member> <member>c </member> <member>homePhone </member> <member>telephoneNumber </member> <member>facsimileTelephoneNumber </member> <member>mobile </member> <member>description </member> </simplelist> The following attributes are generated for the new ldap schema: <simplelist> <member>cn </member> <member>givenName </member> <member>sn </member> <member>title </member> <member>o </member> <member>mail </member> <member>postalAddress </member> <member>homePostalAddress </member> <member>l </member> <member>st </member> <member>postalCode </member> <member>c </member> <member>homePhone </member> <member>telephoneNumber </member> <member>facsimileTelephoneNumber </member> <member>mobile </member> <member>description </member> <member>labeledURI </member> </simplelist> </para> </refsect1> <refsect1 id='pst2ldif.copyright.1'> <title>Copyright</title> <para> Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com> </para> <para> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. </para> <para> You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
</para> </refsect1> <refsect1 id='pst2ldif.version.1'> <title>Version</title> <para> @VERSION@ </para> </refsect1> </refentry> <refentry id="pst2dii.1"> <refentryinfo> - <date>2016-08-29</date> + <date>2017-12-07</date> </refentryinfo> <refmeta> <refentrytitle>pst2dii</refentrytitle> <manvolnum>1</manvolnum> <refmiscinfo>pst2dii @VERSION@</refmiscinfo> </refmeta> <refnamediv id='pst2dii.name.1'> <refname>pst2dii</refname> <refpurpose>extract email messages from a MS Outlook .pst file in DII load format</refpurpose> </refnamediv> <refsynopsisdiv id='pst2dii.synopsis.1'> <title>Synopsis</title> <cmdsynopsis> <command>pst2dii</command> <arg><option>-B <replaceable class="parameter">bates-prefix</replaceable></option></arg> <arg><option>-O <replaceable class="parameter">dii-output-file</replaceable></option></arg> <arg><option>-V</option></arg> <arg><option>-b <replaceable class="parameter">bates-number</replaceable></option></arg> <arg><option>-c <replaceable class="parameter">bates-color</replaceable></option></arg> <arg><option>-d <replaceable class="parameter">debug-file</replaceable></option></arg> <arg choice='plain'>-f <replaceable class="parameter">ttf-font-file</replaceable></arg> <arg><option>-h</option></arg> <arg><option>-o <replaceable class="parameter">output-directory</replaceable></option></arg> <arg choice='plain'>pstfilename</arg> </cmdsynopsis> </refsynopsisdiv> <refsect1 id='pst2dii.options.1'> <title>Options</title> <variablelist> <varlistentry> <term>-B <replaceable class="parameter">bates-prefix</replaceable></term> <listitem><para> Sets the bates prefix string. The bates sequence number is appended to this string, and printed on each page. </para></listitem> </varlistentry> <varlistentry> <term>-O <replaceable class="parameter">dii-output-file</replaceable></term> <listitem><para> Name of the output DII load file. </para></listitem> </varlistentry> <varlistentry> <term>-V</term> <listitem><para> Show program version. 
Subsequent options are then ignored. </para></listitem> </varlistentry> <varlistentry> <term>-b <replaceable class="parameter">bates-number</replaceable></term> <listitem><para> Starting bates sequence number. The default is zero. </para></listitem> </varlistentry> <varlistentry> <term>-c <replaceable class="parameter">bates-color</replaceable></term> <listitem><para> Font color for the bates stamp on each page, specified as 6 hex digits as rrggbb values. The default is ff0000 for bright red. </para></listitem> </varlistentry> <varlistentry> <term>-d <replaceable class="parameter">debug-file</replaceable></term> <listitem><para> Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. </para></listitem> </varlistentry> <varlistentry> <term>-f <replaceable class="parameter">ttf-font-file</replaceable></term> <listitem><para> Specify name of a true type font file. This should be a fixed pitch font. </para></listitem> </varlistentry> <varlistentry> <term>-h</term> <listitem><para> Show summary of options. Subsequent options are then ignored. </para></listitem> </varlistentry> <varlistentry> <term>-o <replaceable class="parameter">output-directory</replaceable></term> <listitem><para> Specifies the output directory. The directory must already exist. </para></listitem> </varlistentry> </variablelist> </refsect1> <refsect1 id='pst2dii.description.1'> <title>Description</title> <para><command>pst2dii</command> reads the email messages from a MS Outlook .pst file and produces a DII load file that may be used to import message summaries into a Summation DII system. The DII output file contains references to the image and attachment files in the output directory. 
</para> </refsect1> <refsect1 id='pst2dii.copyright.1'> <title>Copyright</title> <para> Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com> </para> <para> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. </para> <para> You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. </para> </refsect1> <refsect1 id='pst2dii.version.1'> <title>Version</title> <para> @VERSION@ </para> </refsect1> </refentry> <refentry id="pst.5"> <refentryinfo> - <date>2016-08-29</date> + <date>2017-12-07</date> </refentryinfo> <refmeta> <refentrytitle>outlook.pst</refentrytitle> <manvolnum>5</manvolnum> </refmeta> <refnamediv id='pst.name.1'> <refname>outlook.pst</refname> <refpurpose>format of MS Outlook .pst file</refpurpose> </refnamediv> <refsynopsisdiv id='pst.synopsis.1'> <title>Synopsis</title> <cmdsynopsis> <command>outlook.pst</command> </cmdsynopsis> </refsynopsisdiv> <refsect1 id='pst.file.overview.5'> <title>Overview</title> <para> Low level or primitive items in a .pst file are identified by an I_ID value. Higher level or composite items in a .pst file are identified by a D_ID value. There are two separate b-trees indexed by these I_ID and D_ID values. Starting with Outlook 2003, the file format changed from one with 32 bit pointers, to one with 64 bit pointers. We describe both formats here. </para> </refsect1> <refsect1 id='pst.file.header.32.5'> <title>32 bit File Header</title> <para> The 32 bit file header is located at offset 0 in the .pst file. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 21 42 44 4e 49 f8 64 d9 53 4d 0e 00 13 00 01 01 0010 00 00 00 00 00 00 00 00 50 d6 03 00 bd 1e 02 00 0020 08 4c 00 00 00 04 00 00 00 04 00 00 0f 04 00 00 0030 0d 40 00 00 99 0a 01 00 18 04 00 00 0d 40 00 00 0040 0d 40 00 00 11 80 00 00 02 04 00 00 0a 04 00 00 0050 00 04 00 00 00 04 00 00 0f 04 00 00 0f 04 00 00 0060 0f 04 00 00 0d 40 00 00 00 04 00 00 00 04 00 00 0070 04 40 00 00 00 04 00 00 00 04 00 00 00 04 00 00 0080 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 0090 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 00a0 0c 09 00 00 00 00 00 00 00 04 27 00 00 24 23 00 00b0 c0 09 0a 00 00 c8 00 00 bc 1e 02 00 00 7e 0c 00 00c0 b4 1e 02 00 00 54 00 00 01 00 00 00 23 55 44 d1 00d0 5a 4f ce 6b 80 ff ff ff 00 00 00 00 00 00 00 00 00e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0140 00 00 00 00 00 00 00 00 00 00 00 00 3f ff ff ff 0150 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0160 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0170 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0180 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0190 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01c0 ff ff ff ff ff ff ff ff ff ff ff ff 80 01 00 00 01d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000 signature [4 bytes] 0x4e444221 constant 000a indexType [1 byte] 0x0e constant 01cd encryptionType [1 byte] 0x01 in this case 00a8 total file size [4 bytes] 0x270400 in this case 00c0 backPointer1 [4 bytes] 0x021eb4 in this case 00c4 offsetIndex1 
[4 bytes] 0x005400 in this case 00b8 backPointer2 [4 bytes] 0x021ebc in this case 00bc offsetIndex2 [4 bytes] 0x0c7e00 in this case ]]></literallayout> <para> We only support index types 0x0e, 0x0f, 0x15, and 0x17, and encryption types 0x00, 0x01 and 0x02. Index type 0x0e is the older 32 bit Outlook format. Index type 0x0f seems to be rare, and so far the data seems to be identical to that in type 0x0e files. Index type 0x17 is the newer 64 bit Outlook format. Index type 0x15 seems to be rare, and according to the libpff project should have the same format as type 0x17 files. It was found in a 64-bit pst file created by Visual Recovery. It may be that index types less than 0x10 are 32 bit, and index types greater than or equal to 0x10 are 64 bit, and the low order four bits of the index type are some subtype or minor version number. </para> <para> Encryption type 0x00 is no encryption, type 0x01 is "compressible" encryption which is a simple substitution cipher, and type 0x02 is "strong" encryption, which is a simple three rotor Enigma cipher from WWII. </para> <para> offsetIndex1 is the file offset of the root of the index1 b-tree, which contains (I_ID, offset, size, unknown) tuples for each item in the file. backPointer1 is the value that should appear in the parent pointer of that root node. </para> <para> offsetIndex2 is the file offset of the root of the index2 b-tree, which contains (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID) tuples for each item in the file. backPointer2 is the value that should appear in the parent pointer of that root node. </para> </refsect1> <refsect1 id='pst.file.header.64.5'> <title>64 bit File Header</title> <para> The 64 bit file header is located at offset 0 in the .pst file. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 21 42 44 4e 03 02 23 b2 53 4d 17 00 13 00 01 01 0010 00 00 00 00 00 00 00 00 04 00 00 00 01 00 00 00 0020 8b 00 00 00 00 00 00 00 1d 00 00 00 00 04 00 00 0030 00 04 00 00 04 04 00 00 00 40 00 00 02 00 01 00 0040 00 04 00 00 00 04 00 00 00 04 00 00 00 80 00 00 0050 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 0060 04 04 00 00 04 04 00 00 04 04 00 00 00 04 00 00 0070 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 0080 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 0090 00 04 00 00 00 04 00 00 00 04 00 00 00 04 00 00 00a0 00 04 00 00 00 04 00 00 02 04 00 00 00 00 00 00 00b0 00 00 00 00 00 00 00 00 00 24 04 00 00 00 00 00 00c0 00 44 00 00 00 00 00 00 00 71 03 00 00 00 00 00 00d0 00 22 00 00 00 00 00 00 83 00 00 00 00 00 00 00 00e0 00 6a 00 00 00 00 00 00 8a 00 00 00 00 00 00 00 00f0 00 60 00 00 00 00 00 00 01 00 00 00 00 00 00 00 0100 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0180 7f ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0190 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 01f0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0200 80 00 00 00 e8 00 00 00 00 00 00 00 c4 68 cb 89 0000 signature [4 bytes] 0x4e444221 constant 000a indexType [1 byte] 0x17 constant 0201 encryptionType [1 byte] 0x00 in this case 00b8 total file size [8 bytes] 0x042400 in this case 00e8 
backPointer1 [8 bytes] 0x00008a in this case 00f0 offsetIndex1 [8 bytes] 0x006000 in this case 00d8 backPointer2 [8 bytes] 0x000083 in this case 00e0 offsetIndex2 [8 bytes] 0x006a00 in this case ]]></literallayout> </refsect1> <refsect1 id='pst.file.node1.32.5'> <title>32 bit Index 1 Node</title> <para> The 32 bit index1 b-tree nodes are 512 byte blocks with the following format. </para> <literallayout class="monospaced"><![CDATA[ 0000 04 00 00 00 8a 1e 02 00 00 1c 0b 00 000c 58 27 03 00 b3 1e 02 00 00 52 00 00 0018 00 00 00 00 00 00 00 00 00 00 00 00 0024 00 00 00 00 00 00 00 00 00 00 00 00 0030 00 00 00 00 00 00 00 00 00 00 00 00 003c 00 00 00 00 00 00 00 00 00 00 00 00 0048 00 00 00 00 00 00 00 00 00 00 00 00 0054 00 00 00 00 00 00 00 00 00 00 00 00 0060 00 00 00 00 00 00 00 00 00 00 00 00 006c 00 00 00 00 00 00 00 00 00 00 00 00 0078 00 00 00 00 00 00 00 00 00 00 00 00 0084 00 00 00 00 00 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 009c 00 00 00 00 00 00 00 00 00 00 00 00 00a8 00 00 00 00 00 00 00 00 00 00 00 00 00b4 00 00 00 00 00 00 00 00 00 00 00 00 00c0 00 00 00 00 00 00 00 00 00 00 00 00 00cc 00 00 00 00 00 00 00 00 00 00 00 00 00d8 00 00 00 00 00 00 00 00 00 00 00 00 00e4 00 00 00 00 00 00 00 00 00 00 00 00 00f0 00 00 00 00 00 00 00 00 00 00 00 00 00fc 00 00 00 00 00 00 00 00 00 00 00 00 0108 00 00 00 00 00 00 00 00 00 00 00 00 0114 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 012c 00 00 00 00 00 00 00 00 00 00 00 00 0138 00 00 00 00 00 00 00 00 00 00 00 00 0144 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 015c 00 00 00 00 00 00 00 00 00 00 00 00 0168 00 00 00 00 00 00 00 00 00 00 00 00 0174 00 00 00 00 00 00 00 00 00 00 00 00 0180 00 00 00 00 00 00 00 00 00 00 00 00 018c 00 00 00 00 00 00 00 00 00 00 00 00 0198 00 00 00 00 00 00 00 00 00 00 00 00 01a4 00 00 00 00 00 00 00 00 00 00 00 00 01b0 00 00 00 00 00 00 00 00 00 00 00 00 01bc 00 00 00 00 00 00 00 00 00 00 00 00 01c8 
00 00 00 00 00 00 00 00 00 00 00 00 01d4 00 00 00 00 00 00 00 00 00 00 00 00 01e0 00 00 00 00 00 00 00 00 00 00 00 00 01ec 00 00 00 00 02 29 0c 02 80 80 b6 4a 01f8 b4 1e 02 00 27 9c cc 56 01f0 itemCount [1 byte] 0x02 in this case 01f1 maxItemCount [1 byte] 0x29 constant 01f2 itemSize [1 byte] 0x0c constant 01f3 nodeLevel [1 byte] 0x02 in this case 01f8 backPointer [4 bytes] 0x021eb4 in this case ]]></literallayout> <para> The itemCount specifies the number of 12 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a triple of (I_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and I_ID is the lowest I_ID value in the subtree. </para> </refsect1> <refsect1 id='pst.file.node1.64.5'> <title>64 bit Index 1 Node</title> <para> The 64 bit index1 b-tree nodes are 512 byte blocks with the following format. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 04 00 00 00 00 00 00 00 88 00 00 00 000C 00 00 00 00 00 48 00 00 00 00 00 00 0018 74 00 00 00 00 00 00 00 86 00 00 00 0024 00 00 00 00 00 54 00 00 00 00 00 00 0030 00 00 00 00 00 00 00 00 00 00 00 00 003C 00 00 00 00 00 00 00 00 00 00 00 00 0048 00 00 00 00 00 00 00 00 00 00 00 00 0054 00 00 00 00 00 00 00 00 00 00 00 00 0060 00 00 00 00 00 00 00 00 00 00 00 00 006C 00 00 00 00 00 00 00 00 00 00 00 00 0078 00 00 00 00 00 00 00 00 00 00 00 00 0084 00 00 00 00 00 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 009C 00 00 00 00 00 00 00 00 00 00 00 00 00A8 00 00 00 00 00 00 00 00 00 00 00 00 00B4 00 00 00 00 00 00 00 00 00 00 00 00 00C0 00 00 00 00 00 00 00 00 00 00 00 00 00CC 00 00 00 00 00 00 00 00 00 00 00 00 00D8 00 00 00 00 00 00 00 00 00 00 00 00 00E4 00 00 00 00 00 00 00 00 00 00 00 00 00F0 00 00 00 00 00 00 00 00 00 00 00 00 00FC 00 00 00 00 00 00 00 00 00 00 00 00 0108 00 00 00 00 00 00 00 00 00 00 00 00 0114 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 012C 00 00 00 00 00 00 00 00 00 00 00 00 0138 00 00 00 00 00 00 00 00 00 00 00 00 0144 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 015C 00 00 00 00 00 00 00 00 00 00 00 00 0168 00 00 00 00 00 00 00 00 00 00 00 00 0174 00 00 00 00 00 00 00 00 00 00 00 00 0180 00 00 00 00 00 00 00 00 00 00 00 00 018C 00 00 00 00 00 00 00 00 00 00 00 00 0198 00 00 00 00 00 00 00 00 00 00 00 00 01A4 00 00 00 00 00 00 00 00 00 00 00 00 01B0 00 00 00 00 00 00 00 00 00 00 00 00 01BC 00 00 00 00 00 00 00 00 00 00 00 00 01C8 00 00 00 00 00 00 00 00 00 00 00 00 01D4 00 00 00 00 00 00 00 00 00 00 00 00 01E0 00 00 00 00 00 00 00 00 02 14 18 01 01EC 00 00 00 00 80 80 8a 60 68 e5 b5 19 01F8 8a 00 00 00 00 00 00 00 01e8 itemCount [1 byte] 0x02 in this case 01e9 maxItemCount [1 byte] 0x14 constant 01ea itemSize [1 byte] 0x18 constant 01eb nodeLevel [1 byte] 0x01 in this case 01f8 backPointer [8 bytes] 
0x00008a in this case ]]></literallayout> <para> The itemCount specifies the number of 24 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a triple of (I_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and I_ID is the lowest I_ID value in the subtree. </para> </refsect1> <refsect1 id='pst.file.leaf1.32.5'> <title>32 bit Index 1 Leaf Node</title> <para> The 32 bit index1 b-tree leaf nodes are 512 byte blocks with the following format. </para> <literallayout class="monospaced"><![CDATA[ 0000 04 00 00 00 00 58 00 00 64 00 0f 00 000c 08 00 00 00 80 58 00 00 ac 00 06 00 0018 0c 00 00 00 40 59 00 00 ac 00 06 00 0024 10 00 00 00 00 5a 00 00 bc 00 03 00 0030 14 00 00 00 00 5b 00 00 a4 00 02 00 003c 18 00 00 00 c0 5b 00 00 64 00 02 00 0048 1c 00 00 00 40 5c 00 00 5c 00 02 00 0054 50 00 00 00 80 62 00 00 60 00 02 00 0060 74 00 00 00 00 77 00 00 5e 00 02 00 006c 7c 00 00 00 80 77 00 00 66 00 02 00 0078 84 00 00 00 00 76 00 00 ca 00 02 00 0084 88 00 00 00 00 63 00 00 52 00 02 00 0090 90 00 00 00 00 79 00 00 58 00 02 00 009c cc 00 00 00 c0 61 00 00 76 00 02 00 00a8 e0 00 00 00 00 61 00 00 74 00 02 00 00b4 f4 00 00 00 80 65 00 00 6e 00 02 00 00c0 8c 01 00 00 40 60 00 00 70 00 02 00 00cc ea 01 00 00 80 61 00 00 10 00 02 00 00d8 ec 01 00 00 40 8a 00 00 f3 01 02 00 00e4 f0 01 00 00 80 93 00 00 f4 1f 02 00 00f0 fa 01 00 00 c0 7f 00 00 10 00 02 00 00fc 00 02 00 00 00 89 00 00 34 01 02 00 0108 1c 02 00 00 40 ec 00 00 12 06 02 00 0114 22 02 00 00 00 84 00 00 10 00 02 00 0120 24 02 00 00 c0 ea 00 00 3c 01 02 00 012c 40 02 00 00 00 f4 00 00 0a 06 02 00 0138 46 02 00 00 40 8c 00 00 10 00 02 00 0144 48 02 00 00 80 f2 00 00 36 01 02 00 0150 64 02 00 00 80 fb 00 00 bf 07 02 00 015c 6a 02 00 
00 80 63 00 00 10 00 02 00 0168 6c 02 00 00 40 fa 00 00 2a 01 02 00 0174 6c 02 00 00 40 fa 00 00 2a 01 02 00 0180 6c 02 00 00 40 fa 00 00 2a 01 02 00 018c 6c 02 00 00 40 fa 00 00 2a 01 02 00 0198 6c 02 00 00 40 fa 00 00 2a 01 02 00 01a4 6c 02 00 00 40 fa 00 00 2a 01 02 00 01b0 64 02 00 00 80 fb 00 00 bf 07 02 00 01bc 64 02 00 00 80 fb 00 00 bf 07 02 00 01c8 64 02 00 00 80 fb 00 00 bf 07 02 00 01d4 64 02 00 00 80 fb 00 00 bf 07 02 00 01e0 64 02 00 00 80 fb 00 00 bf 07 02 00 01ec 00 00 00 00 1f 29 0c 00 80 80 5b b3 01f8 5a 67 01 00 4f ae 70 a7 01f0 itemCount [1 byte] 0x1f in this case 01f1 maxItemCount [1 byte] 0x29 constant 01f2 itemSize [1 byte] 0x0c constant 01f3 nodeLevel [1 byte] 0x00 defines a leaf node 01f8 backPointer [4 bytes] 0x01675a in this case ]]></literallayout> <para> The itemCount specifies the number of 12 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a tuple of (I_ID, offset, size, unknown) The two low order bits of the I_ID value seem to be flags. I have never seen a case with bit zero set. Bit one indicates that the item is <emphasis>not</emphasis> encrypted. Note that references to these I_ID values elsewhere may have the low order bit set (and I don't know what that means), but when we do the search in this tree we need to clear that bit so that we can find the correct item. </para> </refsect1> <refsect1 id='pst.file.leaf1.64.5'> <title>64 bit Index 1 Leaf Node</title> <para> The 64 bit index1 b-tree leaf nodes are 512 byte blocks with the following format. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 04 00 00 00 00 00 00 00 00 58 00 00 000C 00 00 00 00 6c 00 05 00 00 00 00 00 0018 08 00 00 00 00 00 00 00 80 58 00 00 0024 00 00 00 00 b4 00 06 00 d8 22 37 08 0030 0c 00 00 00 00 00 00 00 80 59 00 00 003C 00 00 00 00 ac 00 07 00 d8 22 37 08 0048 10 00 00 00 00 00 00 00 40 5a 00 00 0054 00 00 00 00 bc 00 03 00 d8 22 37 08 0060 14 00 00 00 00 00 00 00 40 5b 00 00 006C 00 00 00 00 a4 00 02 00 d8 22 37 08 0078 18 00 00 00 00 00 00 00 00 5c 00 00 0084 00 00 00 00 64 00 02 00 d8 22 37 08 0090 1c 00 00 00 00 00 00 00 80 5c 00 00 009C 00 00 00 00 5c 00 02 00 d8 22 37 08 00A8 24 00 00 00 00 00 00 00 80 5d 00 00 00B4 00 00 00 00 72 00 02 00 d8 22 37 08 00C0 34 00 00 00 00 00 00 00 00 70 00 00 00CC 00 00 00 00 8c 00 02 00 00 0d 00 00 00D8 38 00 00 00 00 00 00 00 c0 71 00 00 00E4 00 00 00 00 5c 00 02 00 d8 22 9c 00 00F0 40 00 00 00 00 00 00 00 40 72 00 00 00FC 00 00 00 00 26 00 02 00 d8 22 9c 00 0108 4c 00 00 00 00 00 00 00 80 5f 00 00 0114 00 00 00 00 3e 00 02 00 d8 22 9c 00 0120 5c 00 00 00 00 00 00 00 c0 76 00 00 012C 00 00 00 00 8c 00 02 00 d8 22 9c 00 0138 64 00 00 00 00 00 00 00 40 75 00 00 0144 00 00 00 00 76 00 02 00 d8 22 9c 00 0150 6c 00 00 00 00 00 00 00 c0 73 00 00 015C 00 00 00 00 5e 00 02 00 d8 22 9c 00 0168 70 00 00 00 00 00 00 00 80 72 00 00 0174 00 00 00 00 1e 01 02 00 d8 22 9c 00 0180 70 00 00 00 00 00 00 00 80 72 00 00 018C 00 00 00 00 1e 01 02 00 d8 22 9c 00 0198 70 00 00 00 00 00 00 00 80 72 00 00 01A4 00 00 00 00 1e 01 02 00 d8 22 9c 00 01B0 74 00 00 00 00 00 00 00 40 74 00 00 01BC 00 00 00 00 e0 00 02 00 d8 22 9c 00 01C8 7c 00 00 00 00 00 00 00 80 77 00 00 01D4 00 00 00 00 dc 00 02 00 d8 22 9c 00 01E0 00 00 00 00 00 00 00 00 10 14 18 00 01EC 00 00 00 00 80 80 88 48 3f 50 0b 04 01F8 88 00 00 00 00 00 00 00 01e8 itemCount [1 byte] 0x10 in this case 01e9 maxItemCount [1 byte] 0x14 constant 01ea itemSize [1 byte] 0x18 constant 01eb nodeLevel [1 byte] 0x00 defines a leaf node 01f8 backPointer [8 bytes] 
0x000088 in this case ]]></literallayout> <para> The itemCount specifies the number of 24 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a tuple of (I_ID, offset, size, unknown) The two low order bits of the I_ID value seem to be flags. I have never seen a case with bit zero set. Bit one indicates that the item is <emphasis>not</emphasis> encrypted. Note that references to these I_ID values elsewhere may have the low order bit set (and I don't know what that means), but when we do the search in this tree we need to clear that bit so that we can find the correct item. </para> </refsect1> <refsect1 id='pst.file.node2.32.5'> <title>32 bit Index 2 Node</title> <para> The 32 bit index2 b-tree nodes are 512 byte blocks with the following format. </para> <literallayout class="monospaced"><![CDATA[ 0000 21 00 00 00 bb 1e 02 00 00 e2 0b 00 000c 64 78 20 00 8c 1e 02 00 00 dc 0b 00 0018 00 00 00 00 00 00 00 00 00 00 00 00 0024 00 00 00 00 00 00 00 00 00 00 00 00 0030 00 00 00 00 00 00 00 00 00 00 00 00 003c 00 00 00 00 00 00 00 00 00 00 00 00 0048 00 00 00 00 00 00 00 00 00 00 00 00 0054 00 00 00 00 00 00 00 00 00 00 00 00 0060 00 00 00 00 00 00 00 00 00 00 00 00 006c 00 00 00 00 00 00 00 00 00 00 00 00 0078 00 00 00 00 00 00 00 00 00 00 00 00 0084 00 00 00 00 00 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 009c 00 00 00 00 00 00 00 00 00 00 00 00 00a8 00 00 00 00 00 00 00 00 00 00 00 00 00b4 00 00 00 00 00 00 00 00 00 00 00 00 00c0 00 00 00 00 00 00 00 00 00 00 00 00 00cc 00 00 00 00 00 00 00 00 00 00 00 00 00d8 00 00 00 00 00 00 00 00 00 00 00 00 00e4 00 00 00 00 00 00 00 00 00 00 00 00 00f0 00 00 00 00 00 00 00 00 00 00 00 00 00fc 00 00 00 00 00 00 00 00 00 00 00 00 0108 00 00 00 00 00 00 00 00 00 00 00 00 0114 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 012c 00 00 00 00 
00 00 00 00 00 00 00 00 0138 00 00 00 00 00 00 00 00 00 00 00 00 0144 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 015c 00 00 00 00 00 00 00 00 00 00 00 00 0168 00 00 00 00 00 00 00 00 00 00 00 00 0174 00 00 00 00 00 00 00 00 00 00 00 00 0180 00 00 00 00 00 00 00 00 00 00 00 00 018c 00 00 00 00 00 00 00 00 00 00 00 00 0198 00 00 00 00 00 00 00 00 00 00 00 00 01a4 00 00 00 00 00 00 00 00 00 00 00 00 01b0 00 00 00 00 00 00 00 00 00 00 00 00 01bc 00 00 00 00 00 00 00 00 00 00 00 00 01c8 00 00 00 00 00 00 00 00 00 00 00 00 01d4 00 00 00 00 00 00 00 00 00 00 00 00 01e0 00 00 00 00 00 00 00 00 00 00 00 00 01ec 00 00 00 00 02 29 0c 02 81 81 b2 60 01f8 bc 1e 02 00 7e 70 dc e3 01f0 itemCount [1 byte] 0x02 in this case 01f1 maxItemCount [1 byte] 0x29 constant 01f2 itemSize [1 byte] 0x0c constant 01f3 nodeLevel [1 byte] 0x02 in this case 01f8 backPointer [4 bytes] 0x021ebc in this case ]]></literallayout> <para> The itemCount specifies the number of 12 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a triple of (D_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and D_ID is the lowest D_ID value in the subtree. </para> </refsect1> <refsect1 id='pst.file.node2.64.5'> <title>64 bit Index 2 Node</title> <para> The 64 bit index2 b-tree nodes are 512 byte blocks with the following format. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 21 00 00 00 00 00 00 00 77 00 00 00 000C 00 00 00 00 00 56 00 00 00 00 00 00 0018 4c 06 00 00 00 00 00 00 82 00 00 00 0024 00 00 00 00 00 68 00 00 00 00 00 00 0030 4f 80 00 00 00 00 00 00 84 00 00 00 003C 00 00 00 00 00 6e 00 00 00 00 00 00 0048 00 00 00 00 00 00 00 00 00 00 00 00 0054 00 00 00 00 00 00 00 00 00 00 00 00 0060 00 00 00 00 00 00 00 00 00 00 00 00 006C 00 00 00 00 00 00 00 00 00 00 00 00 0078 00 00 00 00 00 00 00 00 00 00 00 00 0084 00 00 00 00 00 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 009C 00 00 00 00 00 00 00 00 00 00 00 00 00A8 00 00 00 00 00 00 00 00 00 00 00 00 00B4 00 00 00 00 00 00 00 00 00 00 00 00 00C0 00 00 00 00 00 00 00 00 00 00 00 00 00CC 00 00 00 00 00 00 00 00 00 00 00 00 00D8 00 00 00 00 00 00 00 00 00 00 00 00 00E4 00 00 00 00 00 00 00 00 00 00 00 00 00F0 00 00 00 00 00 00 00 00 00 00 00 00 00FC 00 00 00 00 00 00 00 00 00 00 00 00 0108 00 00 00 00 00 00 00 00 00 00 00 00 0114 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 012C 00 00 00 00 00 00 00 00 00 00 00 00 0138 00 00 00 00 00 00 00 00 00 00 00 00 0144 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 015C 00 00 00 00 00 00 00 00 00 00 00 00 0168 00 00 00 00 00 00 00 00 00 00 00 00 0174 00 00 00 00 00 00 00 00 00 00 00 00 0180 00 00 00 00 00 00 00 00 00 00 00 00 018C 00 00 00 00 00 00 00 00 00 00 00 00 0198 00 00 00 00 00 00 00 00 00 00 00 00 01A4 00 00 00 00 00 00 00 00 00 00 00 00 01B0 00 00 00 00 00 00 00 00 00 00 00 00 01BC 00 00 00 00 00 00 00 00 00 00 00 00 01C8 00 00 00 00 00 00 00 00 00 00 00 00 01D4 00 00 00 00 00 00 00 00 00 00 00 00 01E0 00 00 00 00 00 00 00 00 03 14 18 01 01EC 00 00 00 00 81 81 83 6a 49 da f3 d3 01F8 83 00 00 00 00 00 00 00 01e8 itemCount [1 byte] 0x03 in this case 01e9 maxItemCount [1 byte] 0x14 constant 01ea itemSize [1 byte] 0x18 constant 01eb nodeLevel [1 byte] 0x01 in this case 01f8 backPointer [8 bytes] 
0x000083 in this case ]]></literallayout> <para> The itemCount specifies the number of 24 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a triple of (D_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and D_ID is the lowest D_ID value in the subtree. </para> </refsect1> <refsect1 id='pst.file.leaf2.32.5'> <title>32 bit Index 2 Leaf Node</title> <para> The 32 bit index2 b-tree leaf nodes are 512 byte blocks with the following format. </para> <literallayout class="monospaced"><![CDATA[ 0000 21 00 00 00 38 e6 00 00 00 00 00 00 00 00 00 00 0010 61 00 00 00 2c a8 02 00 36 a8 02 00 00 00 00 00 0020 22 01 00 00 20 a2 02 00 00 00 00 00 22 01 00 00 0030 2d 01 00 00 88 7b 03 00 00 00 00 00 00 00 00 00 0040 2e 01 00 00 08 00 00 00 00 00 00 00 00 00 00 00 0050 2f 01 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 0060 e1 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0070 01 02 00 00 b4 e4 02 00 00 00 00 00 00 00 00 00 0080 61 02 00 00 a0 e4 02 00 00 00 00 00 00 00 00 00 0090 0d 06 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00A0 0e 06 00 00 08 00 00 00 00 00 00 00 00 00 00 00 00B0 0f 06 00 00 0c 00 00 00 00 00 00 00 00 00 00 00 00C0 10 06 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00D0 2b 06 00 00 84 00 00 00 00 00 00 00 00 00 00 00 00E0 4c 06 00 00 1c 00 00 00 00 00 00 00 00 00 00 00 00F0 71 06 00 00 18 00 00 00 00 00 00 00 00 00 00 00 0100 92 06 00 00 14 00 00 00 00 00 00 00 00 00 00 00 0110 23 22 00 00 14 a0 02 00 00 00 00 00 22 01 00 00 0120 26 22 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0130 27 22 00 00 1c a0 02 00 00 00 00 00 00 00 00 00 0140 22 80 00 00 50 00 00 00 00 00 00 00 22 01 00 00 0150 2d 80 00 00 f8 9f 02 00 00 00 00 00 00 00 00 00 0160 2e 80 00 00 08 00 00 00 00 00 
00 00 00 00 00 00 0170 2f 80 00 00 34 e6 00 00 00 00 00 00 00 00 00 00 0180 42 80 00 00 3c 6d 02 00 00 00 00 00 22 80 00 00 0190 4d 80 00 00 04 00 00 00 00 00 00 00 00 00 00 00 01A0 4e 80 00 00 10 6d 02 00 00 00 00 00 00 00 00 00 01B0 4f 80 00 00 ec 23 00 00 00 00 00 00 00 00 00 00 01C0 62 80 00 00 38 78 02 00 00 00 00 00 22 01 00 00 01D0 6d 80 00 00 34 78 02 00 00 00 00 00 00 00 00 00 01E0 6e 80 00 00 08 00 00 00 00 00 00 00 00 00 00 00 01F0 10 1f 10 00 81 81 a0 9a ae 1e 02 00 89 44 6a 0f 01f0 itemCount [1 byte] 0x10 in this case 01f1 maxItemCount [1 byte] 0x1f constant 01f2 itemSize [1 byte] 0x10 constant 01f3 nodeLevel [1 byte] 0x00 in this case 01f8 backPointer [4 bytes] 0x021eae in this case ]]></literallayout> <para> The itemCount specifies the number of 16 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node. </para> <para> Each item in this node is a tuple of (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID) The DESC-I_ID points to the main data for this item (Associated Descriptor Items 0x7cec, 0xbcec, or 0x0101) via the index1 tree. The TREE-I_ID is zero or points to an Associated Tree Item 0x0002 via the index1 tree. The PARENT-D_ID points to the parent of this item in this index2 tree. </para> </refsect1> <refsect1 id='pst.file.leaf2.64.5'> <title>64 bit Index 2 Leaf Node</title> <para> The 64 bit index2 b-tree leaf nodes are 512 byte blocks with the following format. 
</para> <literallayout class="monospaced"><![CDATA[ 0000 21 00 00 00 00 00 00 00 74 00 00 00 00 00 00 00 0010 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0020 61 00 00 00 00 00 00 00 34 00 00 00 00 00 00 00 0030 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0040 22 01 00 00 00 00 00 00 4c 00 00 00 00 00 00 00 0050 00 00 00 00 00 00 00 00 22 01 00 00 02 00 00 00 0060 2d 01 00 00 00 00 00 00 70 00 00 00 00 00 00 00 0070 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0080 2e 01 00 00 00 00 00 00 08 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00A0 2f 01 00 00 00 00 00 00 0c 00 00 00 00 00 00 00 00B0 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00C0 e1 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00D0 00 00 00 00 00 00 00 00 00 00 00 00 d8 e3 13 00 00E0 01 02 00 00 00 00 00 00 8c 00 00 00 00 00 00 00 00F0 00 00 00 00 00 00 00 00 00 00 00 00 b0 e3 13 00 0100 61 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0110 00 00 00 00 00 00 00 00 00 00 00 00 d8 e3 13 00 0120 0d 06 00 00 00 00 00 00 04 00 00 00 00 00 00 00 0130 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0140 0e 06 00 00 00 00 00 00 08 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0160 0f 06 00 00 00 00 00 00 0c 00 00 00 00 00 00 00 0170 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 0180 10 06 00 00 00 00 00 00 10 00 00 00 00 00 00 00 0190 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 01A0 2b 06 00 00 00 00 00 00 24 00 00 00 00 00 00 00 01B0 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 01C0 71 06 00 00 00 00 00 00 18 00 00 00 00 00 00 00 01D0 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 01E0 00 00 00 00 00 00 00 00 0e 0f 20 00 00 00 00 00 01F0 81 81 77 56 f8 32 43 49 77 00 00 00 00 00 00 00 01e8 itemCount [1 byte] 0x0e in this case 01e9 maxItemCount [1 byte] 0x0f constant 01ea itemSize [1 byte] 0x20 constant 01eb nodeLevel [1 byte] 0x00 defines a leaf node 01f8 backPointer [8 bytes] 0x000077 in this case ]]></literallayout> <para> The 
itemCount specifies the number of 32 byte records that are active.
The nodeLevel is zero for these leaf nodes.
The backPointer must match the backPointer from the triple that pointed to this node.
</para>
<para>
Each item in this node is a tuple of (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID).
The DESC-I_ID points to the main data for this item (Associated Descriptor Items 0x7cec, 0xbcec, or 0x0101) via the index1 tree.
The TREE-I_ID is zero or points to an Associated Tree Item 0x0002 via the index1 tree.
The PARENT-D_ID points to the parent of this item in this index2 tree.
</para>
</refsect1>

<refsect1 id='pst.file.list.32.5'>
<title>32 bit Associated Tree Item 0x0002</title>
<para>
A D_ID value may point to an entry in the index2 tree with a non-zero TREE-I_ID which points to this descriptor block via the index1 tree.
It maps local ID2 values (referenced in the main data for the original D_ID item) to I_ID values.
This descriptor block contains triples of (ID2, I_ID, CHILD-I_ID) where the local ID2 data can be found via I_ID, and CHILD-I_ID is either zero or it points to another Associated Tree Item via the index1 tree.
</para>
<para>
In the above 32 bit leaf node, we have a tuple of (0x61, 0x02a82c, 0x02a836, 0).
0x02a836 is the I_ID of the associated tree, and we can look up that I_ID value in the index1 b-tree to find the (offset,size) of the data in the .pst file.
</para>
<literallayout class="monospaced"><![CDATA[
0000  02 00 01 00 9f 81 00 00 30 a8 02 00 00 00 00 00

0000  signature       [2 bytes]  0x0002 constant
0002  count           [2 bytes]  0x0001 in this case
repeating
0004  id2             [4 bytes]  0x00819f in this case
0008  i_id            [4 bytes]  0x02a830 in this case
000c  child-i_id      [4 bytes]  0 in this case
]]></literallayout>
</refsect1>

<refsect1 id='pst.file.list.64.5'>
<title>64 bit Associated Tree Item 0x0002</title>
<para>
This descriptor block contains a tree that maps local ID2 values to I_ID entries, similar to the 32 bit version described above.
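The 32 bit layout described above can be sketched in Python as follows. This is only an illustration, not code taken from libpst: the function name parse_tree_item_32 is invented for this example, and the sketch assumes little-endian fields and a block that has already been fetched via the index1 tree.
<literallayout class="monospaced"><![CDATA[
import struct

def parse_tree_item_32(data):
    """Decode a 32 bit Associated Tree Item into (id2, i_id, child_i_id) triples."""
    signature, count = struct.unpack_from("<HH", data, 0)
    if signature != 0x0002:
        raise ValueError("not an Associated Tree Item (signature 0x%04x)" % signature)
    triples = []
    for n in range(count):
        # Each repeating entry is three little-endian 32 bit integers.
        triples.append(struct.unpack_from("<III", data, 4 + 12 * n))
    return triples

# The 16 byte sample block shown in the 32 bit section.
sample = bytes.fromhex("02000100 9f810000 30a80200 00000000")
for id2, i_id, child in parse_tree_item_32(sample):
    print("id2=0x%06x i_id=0x%06x child-i_id=0x%x" % (id2, i_id, child))
]]></literallayout>
For the sample block this yields the single triple (0x00819f, 0x02a830, 0), matching the field breakdown given for the 32 bit item.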
</para>
<literallayout class="monospaced"><![CDATA[
0000  02 00 02 00 00 00 00 00 92 06 00 00 00 00 00 00
0010  a8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0020  3f 80 00 00 00 00 00 00 98 00 00 00 00 00 00 00
0030  00 00 00 00 00 00 00 00

0000  signature       [2 bytes]  0x0002 constant
0002  count           [2 bytes]  0x0002 in this case
0004  unknown         [4 bytes]  0 possibly constant
repeating
0008  id2             [4 bytes]  0x000692 in this case
000c  unknown1        [2 bytes]  0 may be a count or size
000e  unknown2        [2 bytes]  0 may be a count or size
0010  i_id            [8 bytes]  0x0000a8 in this case
0018  child-i_id      [8 bytes]  0 in this case
]]></literallayout>
</refsect1>

<refsect1 id='pst.file.desc.5'>
<title>Associated Descriptor Item 0xbcec</title>
<para>
Contains information about the item, which may be an email, a contact, or another Outlook item type.
In the 32 bit index2 leaf node above, we have a tuple of (0x21, 0x00e638, 0, 0).
0x00e638 is the I_ID of the associated descriptor, and we can look up that I_ID value in the index1 b-tree to find the (offset,size) of the data in the .pst file.
This descriptor is eventually decoded to a list of MAPI elements.
</para>
<literallayout class="monospaced"><![CDATA[
0000  3c 01 ec bc 20 00 00 00 00 00 00 00 b5 02 06 00
0010  40 00 00 00 f9 0f 02 01 60 00 00 00 01 30 1e 00
0020  80 00 00 00 04 30 1e 00 00 00 00 00 df 35 03 00
0030  ff 00 00 00 e0 35 02 01 a0 00 00 00 e2 35 02 01
0040  e0 00 00 00 e3 35 02 01 c0 00 00 00 e4 35 02 01
0050  00 01 00 00 e5 35 02 01 20 01 00 00 e6 35 02 01
0060  40 01 00 00 e7 35 02 01 60 01 00 00 1e 66 0b 00
0070  00 00 00 00 ff 67 03 00 00 00 00 00 d2 7f 17 d8
0080  64 8c d5 11 83 24 00 50 04 86 95 45 53 74 61 6e
0090  6c 65 79 00 00 00 00 d2 7f 17 d8 64 8c d5 11 83
00A0  24 00 50 04 86 95 45 22 80 00 00 00 00 00 00 d2
00B0  7f 17 d8 64 8c d5 11 83 24 00 50 04 86 95 45 42
00C0  80 00 00 00 00 00 00 d2 7f 17 d8 64 8c d5 11 83
00D0  24 00 50 04 86 95 45 a2 80 00 00 00 00 00 00 d2
00E0  7f 17 d8 64 8c d5 11 83 24 00 50 04 86 95 45 c2
00F0  80 00 00 00 00 00 00 d2 7f 17 d8 64 8c d5 11 83
0100  24 00 50 04 86 95 45 e2 80 00 00 00 00 00 00 d2
0110  7f 17 d8 64 8c d5 11 83 24 00 50 04 86 95 45 02
0120  81 00 00 00 00 00 00 d2 7f 17 d8 64 8c d5 11 83
0130  24 00 50 04 86 95 45 62 80 00 00 00 0b 00 00 00
0140  0c 00 14 00 7c 00 8c 00 93 00 ab 00 c3 00 db 00
0150  f3 00 0b 01 23 01 3b 01

0000  indexOffset     [2 bytes]  0x013c in this case
0002  signature       [2 bytes]  0xbcec constant
0004  b5offset        [4 bytes]  0x0020 index reference
]]></literallayout>
<para>
Note the signature of 0xbcec.
There are other descriptor block formats with other signatures.
Note the indexOffset of 0x013c - starting at that position in the descriptor block, we have an array of two byte integers.
The first integer (0x000b) is the count, which is one less than the number of overlapping pairs that follow it.
The first pair is (0, 0xc), the next pair is (0xc, 0x14) and the last (12th) pair is (0x123, 0x13b).
These pairs are (start,end+1) offsets of items in this block.
So we have count+2 integers following the count value.
</para>
<para>
Note the b5offset of 0x0020, which is a type that I will call an index reference.
Such index references have at least two different forms, and may point to data either in this block, or in some other block.
External pointer references have the low order 4 bits all set, and are ID2 values that can be used to fetch data.
This value of 0x0020 is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0002, which is then a byte offset to be added to the above indexOffset plus two (to skip the count).
That gives 0x013c + 2 + 2 = 0x0140, so it points to the (0xc, 0x14) pair.
</para>
<para>
So far we have only described internal index references where the high order 16 bits are zero.
That suffices for single descriptor blocks.
But in the case of the type 0x0101 descriptor block, we have an array of subblocks.
In this case, the high order 16 bits of an internal index reference are used to select the subblock.
Each subblock starts with a 16 bit indexOffset which points to the count and the array of 16 bit integer pairs, which are offsets in the current subblock.
</para>
<para>
Finally, we have the offset and size of the "b5" block located at offset 0xc with a size of 8 bytes in this descriptor block.
The "b5" block has the following format:
</para>
<literallayout class="monospaced"><![CDATA[
0000  signature       [2 bytes]  0x02b5 constant
0002  datasize        [2 bytes]  0x0006 constant +2 for 8 byte entries
0004  descoffset      [4 bytes]  0x0040 index reference
]]></literallayout>
<para>
Note the descoffset of 0x0040, which again is an index reference.
In this case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0004, which is then a byte offset to be added to the above indexOffset plus two (to skip the count).
That gives 0x013c + 2 + 4 = 0x0142, so it points to the (0x14, 0x7c) pair.
The datasize (6) plus the b5 code (02) gives the size of the entries, in this case 8 bytes.
We now have the offset 0x14 of the descriptor array, composed of 8 byte entries that describe MAPI elements.
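The reference handling described so far can be sketched in Python. This is only an illustration, not the libpst implementation: the helper name resolve_internal_ref is made up for this example, only internal references within a single block (high order 16 bits zero) are handled, and the sample block is zero filled except for the header, the "b5" block, and the index table copied from the dump above.
<literallayout class="monospaced"><![CDATA[
import struct

def resolve_internal_ref(block, ref):
    """Return the (start, end) pair selected by an internal index reference."""
    if (ref & 0xF) == 0xF:
        raise ValueError("external reference (an ID2 value), not handled here")
    if ref >> 16:
        raise ValueError("subblock references (type 0x0101) are not handled here")
    (index_offset,) = struct.unpack_from("<H", block, 0)
    # Skip the two byte count, then step (ref >> 4) bytes into the offset array.
    return struct.unpack_from("<HH", block, index_offset + 2 + (ref >> 4))

# Rebuild just enough of the 0xbcec sample: header, b5 block, and index table.
block = bytearray(0x158)
block[0x000:0x008] = bytes.fromhex("3c01ecbc 20000000")
block[0x00c:0x014] = bytes.fromhex("b5020600 40000000")
block[0x13c:0x158] = bytes.fromhex(
    "0b00 0000 0c00 1400 7c00 8c00 9300 ab00 c300 db00 f300 0b01 2301 3b01")

(b5offset,) = struct.unpack_from("<I", block, 4)
b5_start, b5_end = resolve_internal_ref(block, b5offset)
signature, datasize, descoffset = struct.unpack_from("<HHI", block, b5_start)
entry_size = datasize + (signature >> 8)   # datasize plus the b5 code
desc_start, desc_end = resolve_internal_ref(block, descoffset)
entry_count = (desc_end - desc_start) // entry_size
print(hex(b5_start), hex(b5_end), entry_size, hex(desc_start), entry_count)
]]></literallayout>
With the sample values this locates the "b5" block at (0xc, 0x14) and the descriptor array at (0x14, 0x7c), which holds 13 entries of 8 bytes each.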
Each descriptor entry has the following format:
</para>
<literallayout class="monospaced"><![CDATA[
0000  itemType        [2 bytes]
0002  referenceType   [2 bytes]
0004  value           [4 bytes]
]]></literallayout>
<para>
For some reference types (2, 3, 0xb) the value is used directly.
Otherwise, the value is an index reference, which is either an ID2 value, or an offset, to be right shifted by 4 bits and used to fetch a pair from the index table to find the offset and size of the item in this descriptor block.
</para>
<para>
The following reference types are known, but not all of these are implemented in the code yet.
</para>
<literallayout class="monospaced"><![CDATA[
0x0002 - Signed 16bit value
0x0003 - Signed 32bit value
0x0004 - 4-byte floating point
0x0005 - Floating point double
0x0006 - Signed 64-bit int
0x0007 - Application Time
0x000A - 32-bit error value
0x000B - Boolean (non-zero = true)
0x000D - Embedded Object
0x0014 - 8-byte signed integer (64-bit)
0x001E - Null terminated String
0x001F - Unicode string
0x0040 - Systime - Filetime structure
0x0048 - OLE Guid
0x0102 - Binary data
0x1003 - Array of 32bit values
0x1014 - Array of 64bit values
0x101E - Array of Strings
0x1102 - Array of Binary data
]]></literallayout>
<para>
The following item types are known, but not all of these are implemented in the code yet.
</para>
<literallayout class="monospaced"><![CDATA[
0x0002  Alternate recipient allowed
0x0003  Extended Attributes Table
0x0017  Importance Level
0x001a  IPM Context, message class
0x0023  Global delivery report requested
0x0026  Priority
0x0029  Read Receipt
0x002b  Reassignment Prohibited
0x002e  Original Sensitivity
0x0032  Report time
0x0036  Sensitivity
0x0037  Email Subject
0x0039  Client submit time / date sent
0x003b  Outlook Address of Sender
0x003f  Outlook structure describing the recipient
0x0040  Name of the Outlook recipient structure
0x0041  Outlook structure describing the sender
0x0042  Name of the Outlook sender structure
0x0043  Another structure describing the recipient
0x0044  Name of the second recipient structure
0x004f  Reply-To Outlook Structure
0x0050  Name of the Reply-To structure
0x0051  Outlook Name of recipient
0x0052  Second Outlook name of recipient
0x0057  My address in TO field
0x0058  My address in CC field
0x0059  Message addressed to me
0x0063  Response requested
0x0064  Sender's Address access method (SMTP, EX)
0x0065  Sender's Address
0x0070  Conversation topic, processed subject (with Fwd:, Re, ... removed)
0x0071  Conversation index
0x0072  Original display BCC
0x0073  Original display CC
0x0074  Original display TO
0x0075  Recipient Address Access Method (SMTP, EX)
0x0076  Recipient's Address
0x0077  Second Recipient Access Method (SMTP, EX)
0x0078  Second Recipient Address
0x007d  Email Header. This is the header that was attached to the email
0x0c04  NDR Reason code
0x0c05  NDR Diag code
0x0c06  Non-receipt notification requested
0x0c17  Reply Requested
0x0c19  Second sender structure
0x0c1a  Name of second sender structure
0x0c1b  Supplementary info
0x0c1d  Second outlook name of sender
0x0c1e  Second sender access method (SMTP, EX)
0x0c1f  Second Sender Address
0x0c20  NDR status code
0x0e01  Delete after submit
0x0e02  BCC Addresses
0x0e03  CC Addresses
0x0e04  SentTo Address
0x0e06  Date.
0x0e07  Flag bits
            0x01  - Read
            0x02  - Unmodified
            0x04  - Submit
            0x08  - Unsent
            0x10  - Has Attachments
            0x20  - From Me
            0x40  - Associated
            0x80  - Resend
            0x100 - RN Pending
            0x200 - NRN Pending
0x0e08  Message Size
0x0e0a  Sentmail EntryID
0x0e1d  Normalized subject
0x0e1f  Compressed RTF in Sync
0x0e20  Attachment Size
0x0ff9  binary record header
0x1000  Plain Text Email Body. Does not exist if the email doesn't have a plain text version
0x1001  Report Text
0x1006  RTF Sync Body CRC
0x1007  RTF Sync Body character count
0x1008  RTF Sync body tag
0x1009  RTF Compressed body
0x1010  RTF whitespace prefix count
0x1011  RTF whitespace trailing count
0x1013  HTML Email Body. Does not exist if the email doesn't have an HTML version
0x1035  Message ID
0x1042  In-Reply-To or Parent's Message ID
0x1046  Return Path
0x3001  Folder Name? I have also seen this value used for the contacts record
0x3002  Address Type
0x3003  Contact Address
0x3004  Comment
0x3007  Date item creation
0x3008  Date item modification
0x300b  binary record header
0x35df  Valid Folder Mask
0x35e0  binary record contains a reference to "Top of Personal Folder" item
0x35e2  binary record contains a reference to default outbox item
0x35e3  binary record contains a reference to "Deleted Items" item
0x35e4  binary record contains a reference to sent items folder item
0x35e5  binary record contains a reference to user views folder item
0x35e6  binary record contains a reference to common views folder item
0x35e7  binary record contains a reference to "Search Root" item
0x3602  the number of emails stored in a folder
0x3603  the number of unread emails in a folder
0x360a  Has Subfolders
0x3613  the folder content description
0x3617  Associate Content count
0x3701  Binary Data attachment
0x3704  Attachment Filename
0x3705  Attachment method
0x3707  Attachment Filename long
0x370b  Attachment Position
0x370e  Attachment mime encoding
0x3710  Attachment mime Sequence
0x3712  Content ID
0x3a00  Contact's Account name
0x3a01  Contact Alternate Recipient
0x3a02  Callback telephone number
0x3a03  Message Conversion Prohibited
0x3a05  Contacts Suffix
0x3a06  Contacts First Name
0x3a07  Contacts Government ID Number
0x3a08  Business Telephone Number
0x3a09  Home Telephone Number
0x3a0a  Contacts Initials
0x3a0b  Keyword
0x3a0c  Contact's Language
0x3a0d  Contact's Location
0x3a0e  Mail Permission
0x3a0f  MHS Common Name
0x3a10  Organizational ID #
0x3a11  Contacts Surname
0x3a12  original entry id
0x3a13  original display name
0x3a14  original search key
0x3a15  Default Postal Address
0x3a16  Company Name
0x3a17  Job Title
0x3a18  Department Name
0x3a19  Office Location
0x3a1a  Primary Telephone
0x3a1b  Business Phone Number 2
0x3a1c  Mobile Phone Number
0x3a1d  Radio Phone Number
0x3a1e  Car Phone Number
0x3a1f  Other Phone Number
0x3a20  Transmittable Display Name
0x3a21  Pager Phone Number
0x3a22  user certificate
0x3a23  Primary Fax Number
0x3a24  Business Fax Number
0x3a25  Home Fax Number
0x3a26  Business Address Country
0x3a27  Business Address City
0x3a28  Business Address State
0x3a29  Business Address Street
0x3a2a  Business Postal Code
0x3a2b  Business PO Box
0x3a2c  Telex Number
0x3a2d  ISDN Number
0x3a2e  Assistant Phone Number
0x3a2f  Home Phone 2
0x3a30  Assistant's Name
0x3a40  Can receive Rich Text
0x3a41  Wedding Anniversary
0x3a42  Birthday
0x3a43  Hobbies
0x3a44  Middle Name
0x3a45  Display Name Prefix (Title)
0x3a46  Profession
0x3a47  Preferred By Name
0x3a48  Spouse's Name
0x3a49  Computer Network Name
0x3a4a  Customer ID
0x3a4b  TTY/TDD Phone
0x3a4c  Ftp Site
0x3a4d  Gender
0x3a4e  Manager's Name
0x3a4f  Nickname
0x3a50  Personal Home Page
0x3a51  Business Home Page
0x3a57  Company Main Phone
0x3a58  Children's names
0x3a59  Home Address City
0x3a5a  Home Address Country
0x3a5b  Home Address Postal Code
0x3a5c  Home Address State or Province
0x3a5d  Home Address Street
0x3a5e  Home Address Post Office Box
0x3a5f  Other Address City
0x3a60  Other Address Country
0x3a61  Other Address Postal Code
0x3a62  Other Address State
0x3a63  Other Address Street
0x3a64  Other Address Post Office box
0x3fde  Internet code page
0x3ffd  Message code page
0x65e3  Entry ID
0x67f2  Attachment ID2 value
0x67ff  Password checksum
0x6f02  Secure HTML Body
0x6f04  Secure Text Body
0x7c07  Top of folders RecID
0x8005  Contact Fullname
0x801a  Home Address
0x801b  Business Address
0x801c  Other Address
0x8045  Work Address Street
0x8046  Work Address City
0x8047  Work Address State
0x8048  Work Address Postal Code
0x8049  Work Address Country
0x804a  Work Address Post Office Box
0x8082  Email Address 1 Transport
0x8083  Email Address 1 Address
0x8084  Email Address 1 Description
0x8085  Email Address 1 Record
0x8092  Email Address 2 Transport
0x8093  Email Address 2 Address
0x8094  Email Address 2 Description
0x8095  Email Address 2 Record
0x80a2  Email Address 3 Transport
0x80a3  Email Address 3 Address
0x80a4  Email Address 3 Description
0x80a5  Email Address 3 Record
0x80d8  Internet Free/Busy
0x8205  Appointment shows as
0x8208  Appointment Location
0x820d  Appointment start
0x820e  Appointment end
0x8214  Label for appointment
0x8215  All day appointment flag
0x8216  Appointment recurrence data
0x8223  Appointment is recurring
0x8231  Recurrence type
0x8232  Recurrence description
0x8234  TimeZone of times
0x8235  Recurrence Start Time
0x8236  Recurrence End Time
0x8501  Reminder minutes before appointment start
0x8503  Reminder alarm
0x8516  Common Time Start
0x8517  Common Time End
0x851f  Play reminder sound filename
0x8530  Followup String
0x8534  Mileage
0x8535  Billing Information
0x8554  Outlook Version
0x8560  Appointment Reminder Time
0x8700  Journal Entry Type
0x8706  Start Timestamp
0x8708  End Timestamp
0x8712  Journal Entry Type - duplicate?
]]></literallayout>
</refsect1>

<refsect1 id='pst.file.desc2.5'>
<title>Associated Descriptor Item 0x7cec</title>
<para>
This style of descriptor block is similar to the 0xbcec format.
This descriptor is also eventually decoded to a list of MAPI elements.
</para>
<literallayout class="monospaced"><![CDATA[
0000  7a 01 ec 7c 40 00 00 00 00 00 00 00 b5 04 02 00
0010  60 00 00 00 7c 18 60 00 60 00 62 00 65 00 20 00
0020  00 00 80 00 00 00 00 00 00 00 03 00 20 0e 0c 00
0030  04 03 1e 00 01 30 2c 00 04 0b 1e 00 03 37 28 00
0040  04 0a 1e 00 04 37 14 00 04 05 03 00 05 37 10 00
0050  04 04 1e 00 07 37 24 00 04 09 1e 00 08 37 20 00
0060  04 08 02 01 0a 37 18 00 04 06 03 00 0b 37 08 00
0070  04 02 1e 00 0d 37 1c 00 04 07 1e 00 0e 37 40 00
0080  04 10 02 01 0f 37 30 00 04 0c 1e 00 11 37 34 00
0090  04 0d 1e 00 12 37 3c 00 04 0f 1e 00 13 37 38 00
00A0  04 0e 03 00 f2 67 00 00 04 00 03 00 f3 67 04 00
00B0  04 01 03 00 09 69 44 00 04 11 03 00 fa 7f 5c 00
00C0  04 15 40 00 fb 7f 4c 00 08 13 40 00 fc 7f 54 00
00D0  08 14 03 00 fd 7f 48 00 04 12 0b 00 fe 7f 60 00
00E0  01 16 0b 00 ff 7f 61 00 01 17 45 82 00 00 00 00
00F0  45 82 00 00 78 3c 00 00 ff ff ff ff 49 1e 00 00
0100  06 00 00 00 00 00 00 00 a0 00 00 00 00 00 00 00
0110  00 00 00 00 00 00 00 00 00 00 00 00 c0 00 00 00
0120  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0130  00 00 00 00 00 00 00 00 00 00 00 00 00 40 dd a3
0140  57 45 b3 0c 00 40 dd a3 57 45 b3 0c 02 00 00 00
0150  00 00 fa 10 3e 2a 86 48 86 f7 14 03 0a 03 02 01
0160  4a 2e 20 44 61 76 69 64 20 4b 61 72 61 6d 27 73
0170  20 42 69 72 74 68 64 61 79 00 06 00 00 00 0c 00
0180  14 00 ea 00 f0 00 55 01 60 01 79 01

0000  indexOffset     [2 bytes]  0x017a in this case
0002  signature       [2 bytes]  0x7cec constant
0004  7coffset        [4 bytes]  0x0040 index reference
]]></literallayout>
<para>
Note the signature of 0x7cec.
There are other descriptor block formats with other signatures.
Note the indexOffset of 0x017a - starting at that position in the descriptor block, we have an array of two byte integers.
The first integer (0x0006) is the count, which is one less than the number of overlapping pairs that follow it.
The first pair is (0, 0xc), the next pair is (0xc, 0x14) and the last (7th) pair is (0x160, 0x179).
These pairs are (start,end+1) offsets of items
in this block. So we have count+2 integers following the count value.
</para>
<para>
Note the 7coffset of 0x0040, which is an index reference. In this case,
it is an internal reference pointer, which needs to be right shifted by 4 bits
to become 0x0004, which is then a byte offset to be added to the above
indexOffset plus two (to skip the count), so it points to the (0x14, 0xea) pair.
We have the offset and size of the "7c" block located at offset 0x14
with a size of 214 bytes in this case. The "7c" block starts with
a header with the following format:
</para>
<literallayout class="monospaced"><![CDATA[
0000 signature [1 byte] 0x7c constant
0001 itemCount [1 byte] 0x18 in this case
0002 unknown [2 bytes] 0x0060 in this case
0004 unknown [2 bytes] 0x0060 in this case
0006 unknown [2 bytes] 0x0062 in this case
0008 recordSize [2 bytes] 0x0065 in this case
000a b5Offset [4 bytes] 0x0020 index reference
000e index2Offset [4 bytes] 0x0080 index reference
0012 unknown [2 bytes] 0x0000 in this case
0014 unknown [2 bytes] 0x0000 in this case
]]></literallayout>
<para>
Note the b5Offset of 0x0020, which is an index reference. In this case,
it is an internal reference pointer, which needs to be right shifted by 4 bits
to become 0x0002, which is then a byte offset to be added to the above
indexOffset plus two (to skip the count), so it points to the (0xc, 0x14) pair.
Finally, we have the offset and size of the "b5" block located at offset 0xc
with a size of 8 bytes in this descriptor block.
The "b5" block has the following format:
</para>
<literallayout class="monospaced"><![CDATA[
0000 signature [2 bytes] 0x04b5 constant
0002 datasize [2 bytes] 0x0002 +4 for 6 byte entries in this case
0004 descoffset [4 bytes] 0x0060 index reference
]]></literallayout>
<para>
Note the descoffset of 0x0060, which again is an index reference.
In this case,
it is an internal reference pointer, which needs to be right shifted by 4 bits
to become 0x0006, which is then a byte offset to be added to the above
indexOffset plus two (to skip the count), so it points to the (0xea, 0xf0) pair.
The datasize (2) plus the b5 code (04) gives the size of the entries,
in this case 6 bytes. We now have the offset 0xea of an unused block
of data in an unknown format, composed of 6 byte entries. That gives us
(0xf0 - 0xea)/6 = 1, so we have a recordCount of one.
</para>
<para>
We have seen cases where the descoffset in the b5 block is zero, and
the index2Offset in the 7c block is zero. This has been seen for
objects that seem to be attachments on messages that have been
read. Before the message was read, it did not have any attachments.
</para>
<para>
Note the index2Offset above of 0x0080, which again is an index reference. In this case,
it is an internal reference pointer, which needs to be right shifted by 4 bits
to become 0x0008, which is then a byte offset to be added to the above
indexOffset plus two (to skip the count), so it points to the (0xf0, 0x155) pair.
This is an array of tables of four byte integers. We will call these
the IND2 tables. The size of each of these tables is specified by the
recordSize field of the "7c" header. The number of these tables is the
above recordCount value derived from the "b5" block.
</para>
<para>
Now the remaining data in the "7c" block after the header starts at
offset 0x2a. There should be itemCount 8 byte items here, with the
following format:
</para>
<literallayout class="monospaced"><![CDATA[
0000 referenceType [2 bytes]
0002 itemType [2 bytes]
0004 ind2Offset [2 bytes]
0006 size [1 byte]
0007 unknown [1 byte]
]]></literallayout>
<para>
The ind2Offset is a byte offset into the current IND2 table of some value.
If that is a four byte integer value, then once we fetch that, we have
the same triple (item type, reference type, value) as we find in the
0xbcec style descriptor blocks. If not, then this value is used directly.
These 8 byte descriptors are processed recordCount times, each time using
the next IND2 table. The item and reference types are as described above
for the 0xbcec format descriptor block.
</para>
</refsect1>

<refsect1 id='pst.file.desc3.32.5'>
<title>32 bit Associated Descriptor Item 0x0101</title>
<para>
This descriptor block contains a list of I_ID values. It is used when an
I_ID (that would normally point to a type 0x7cec or 0xbcec descriptor block)
contains more data than can fit in any single descriptor of those types.
In this case, it points to a type 0x0101 block, which contains a list of
I_ID values that themselves point to the actual descriptor blocks.
The total length value in the 0x0101 header is the sum of the lengths of
the blocks pointed to by the list of I_ID values. The result is an array
of subblocks that may contain index references where the high order 16 bits
specify which descriptor subblock to use. Only the first descriptor subblock
contains the signature (0xbcec or 0x7cec).
</para>
<literallayout class="monospaced"><![CDATA[
0000 01 01 02 00 26 28 00 00 18 77 0c 00 b8 04 00 00

0000 signature [2 bytes] 0x0101 constant
0002 count [2 bytes] 0x0002 in this case
0004 total length [4 bytes] 0x002826 in this case
repeating
0008 i_id [4 bytes] 0x0c7718 in this case
000c i_id [4 bytes] 0x0004b8 in this case
]]></literallayout>
</refsect1>

<refsect1 id='pst.file.desc3.64.5'>
<title>64 bit Associated Descriptor Item 0x0101</title>
<para>
This descriptor block contains a list of I_ID values, similar to the
32 bit version described above.
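</para>
<para>
The following is a minimal sketch of decoding a 0x0101 block, offered as an
illustration only and not as the libpst implementation. It assumes that all
fields are little-endian, as the example bytes suggest, and it takes the
width of each I_ID (4 bytes for the 32 bit format, 8 bytes for the 64 bit
format) as a parameter. The function name parse_0101 is hypothetical.
</para>
<literallayout class="monospaced"><![CDATA[
```python
def parse_0101(block: bytes, id_size: int = 4):
    """Decode a 0x0101 block into (count, total_length, [i_id, ...]).

    Assumption: every field is little-endian. id_size is 4 for the
    32 bit format and 8 for the 64 bit format.
    """
    if id_size not in (4, 8):
        raise ValueError("id_size must be 4 or 8")
    if len(block) < 8:
        raise ValueError("block shorter than the 8 byte header")

    # Header: 2 byte signature, 2 byte count, 4 byte total length.
    signature = int.from_bytes(block[0:2], "little")
    if signature != 0x0101:
        raise ValueError("not a 0x0101 block")
    count = int.from_bytes(block[2:4], "little")
    total_length = int.from_bytes(block[4:8], "little")

    # The I_ID values follow the header back to back.
    end = 8 + count * id_size
    if len(block) < end:
        raise ValueError("block too short for the stated count")
    ids = [
        int.from_bytes(block[pos:pos + id_size], "little")
        for pos in range(8, end, id_size)
    ]
    return count, total_length, ids


# The 32 bit example bytes from the previous section.
example32 = bytes.fromhex("01 01 02 00 26 28 00 00 18 77 0c 00 b8 04 00 00")
count, total_length, ids = parse_0101(example32, id_size=4)
print(count, hex(total_length), [hex(i) for i in ids])
```
]]></literallayout>
<para>
For the 32 bit example bytes this gives a count of 2, a total length of
0x2826, and the I_ID values 0x0c7718 and 0x0004b8, matching the field
breakdown in the previous section. Passing an id_size of 8 applies the
same logic to the 64 bit layout.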
</para>
<literallayout class="monospaced"><![CDATA[
0000 01 01 02 00 ea 29 00 00 10 83 00 00 00 00 00 00
0010 1c 83 00 00 00 00 00 00

0000 signature [2 bytes] 0x0101 constant
0002 count [2 bytes] 0x0002 in this case
0004 total length [4 bytes] 0x0029ea in this case
repeating
0008 i_id [8 bytes] 0x008310 in this case
0010 i_id [8 bytes] 0x00831c in this case
]]></literallayout>
</refsect1>
</refentry>
</reference>