diff --git a/libpst.spec.in b/libpst.spec.in
index 4c76b37..aa66a8e 100644
--- a/libpst.spec.in
+++ b/libpst.spec.in
@@ -1,698 +1,698 @@
 %if "%{?dist}" == ".el8"
 %define fedora 32
 %endif
 %if 0%{?fedora} > 27 || 0%{?rhel} >= 9
 %global use_python3 1
 %define __python %{__python3}
 %endif
 %if 0%{?rhel} >= 9
 %global with_dii 0
 %else
 %global with_dii 1
 %endif
 Summary: Utilities to convert Outlook .pst files to other formats
 Name: @PACKAGE@
 Version: @VERSION@
 Release: 1%{?dist}
 License: GPLv2+
-URL: http://www.five-ten-sg.com/%{name}/
+URL: https://github.com/pst-format/%{name}/
 Source: %{url}/packages/%{name}-%{version}.tar.gz
 BuildRequires: make
 BuildRequires: libtool gcc-c++
 BuildRequires: gd-devel zlib-devel boost-devel libgsf-devel gettext-devel
 %if 0%{with_dii}
 BuildRequires: ImageMagick
 %endif
 %if 0%{?use_python3}
 BuildRequires: python3 python3-devel boost-python3 boost-python3-devel
 Requires: boost-python3
 %else
 BuildRequires: python-devel
 %endif
 Requires: libgsf gettext
 Requires: %{name}-libs%{?_isa} = %{version}-%{release}
 %if 0%{with_dii}
 Requires: ImageMagick%{?_isa}
 %endif
 %{!?python_sitelib: %global python_sitelib %(%{__python} -c "from distutils.sysconfig import get_python_lib; print get_python_lib()")}
 %{!?python_sitearch: %global python_sitearch %(%{__python} -c "from distutils.sysconfig import get_python_lib; print get_python_lib(1)")}
 %if 0%{with_dii}
 %description
 The Libpst utilities include readpst which can convert email messages to both mbox and MH mailbox formats, pst2ldif which can convert the contacts to .ldif format for import into ldap databases, and pst2dii which can convert email messages to the DII load file format used by Summation.
 %else
 %description
 The Libpst utilities include readpst which can convert email messages to both mbox and MH mailbox formats, pst2ldif which can convert the contacts to .ldif format for import into ldap databases.
%endif %package libs Summary: Shared library used by the pst utilities %description libs The libpst-libs package contains the shared library used by the pst utilities. %if 0%{?use_python3} %package -n python3-%{name} Requires: python3 Provides: %{name}-python = %{version}-%{release} %else %package python Requires: python %endif Summary: Python bindings for libpst Requires: %{name}-libs%{?_isa} = %{version}-%{release} %if 0%{?fedora} >= 20 || 0%{?rhel} >= 9 %global __provides_exclude_from %{?__provides_exclude_from:%__provides_exclude_from|}^%{python_sitearch}/_.*\.so$ %else %{?filter_setup: %filter_provides_in %{python_sitearch}/_.*\.so$ %filter_setup } %endif %if 0%{?use_python3} %description -n python3-%{name} %else %description python %endif The libpst-python package allows you to use the libpst shared object from Python code. %package devel Summary: Library links and header files for libpst application development Requires: pkgconfig Requires: %{name}-libs%{?_isa} = %{version}-%{release} %description devel The libpst-devel package contains the library links and header files you'll need to develop applications using the libpst shared library. You do not need to install it if you just want to use the libpst utilities. %package devel-doc Summary: Documentation for libpst.so for libpst application development Requires: %{name}-doc = %{version}-%{release} %description devel-doc The libpst-devel-doc package contains the doxygen generated documentation for the libpst.so shared library. %package doc Summary: Documentation for the pst utilities in html format %description doc The libpst-doc package contains the html documentation for the pst utilities. You do not need to install it if you just want to use the libpst utilities. 
%prep %setup -q %build autoreconf -fiv %configure --enable-libpst-shared \ %if 0%{with_dii} --enable-dii \ %else --disable-dii \ %endif --with-boost-python=boost_python%{python3_version_nodots} %if 0%{?use_python3} %make_build %else make %{?_smp_mflags} %endif %install %if 0%{?use_python3} %make_install %else rm -rf $RPM_BUILD_ROOT make DESTDIR=$RPM_BUILD_ROOT install %endif #Remove libtool archives. find %{buildroot} -name '*.la' -or -name '*.a' | xargs rm -f mv %{buildroot}%{_datadir}/doc/%{name}-%{version} %{buildroot}%{_datadir}/doc/%{name} # Remove pst2dii man page, when it's not built %if !0%{with_dii} rm %{buildroot}%{_mandir}/man1/pst2dii.1* %endif %if 0%{?use_python3} %ldconfig_scriptlets libs %else %post libs -p /sbin/ldconfig %postun libs -p /sbin/ldconfig %endif %files %{_bindir}/* %{_mandir}/man1/* %{_mandir}/man5/* %files libs %{_libdir}/libpst.so.* %doc COPYING %if 0%{?use_python3} %files -n python3-%{name} %defattr(-,root,root,-) %{python3_sitearch}/_*.so %else %files python %{python_sitearch}/_*.so %endif %files devel %{_libdir}/libpst.so %{_includedir}/%{name}-@LIBPST_SO_MAJOR@/ %{_libdir}/pkgconfig/libpst.pc %files devel-doc %{_datadir}/doc/%{name}/devel/ %files doc %dir %{_datadir}/doc/%{name}/ %{_datadir}/doc/%{name}/*.html %{_datadir}/doc/%{name}/AUTHORS %{_datadir}/doc/%{name}/COPYING %{_datadir}/doc/%{name}/ChangeLog %{_datadir}/doc/%{name}/NEWS %{_datadir}/doc/%{name}/README %changelog * Sat Mar 27 2021 Carl Byington 0.6.76-1 - Stuart C. Naifeh - fix rfc2231 encoding when saving messages to both .eml and .msg formats. 
- fix template issue to build with gcc 11 * Tue Feb 02 2021 Milan Crha - 0.6.75-9 - Resolves: #1913613 (Disable DII (and ImageMagic dependency) for RHEL 9) * Tue Jan 26 2021 Fedora Release Engineering - 0.6.75-8 - Rebuilt for https://fedoraproject.org/wiki/Fedora_34_Mass_Rebuild * Fri Jan 22 2021 Jonathan Wakely - 0.6.75-7 - Rebuilt for Boost 1.75 * Tue Jul 28 2020 Merlin Mathesius - 0.6.75-6 - FTBFS fix: %%{__python} must now be explicitly defined * Tue Jul 28 2020 Fedora Release Engineering - 0.6.75-5 - Rebuilt for https://fedoraproject.org/wiki/Fedora_33_Mass_Rebuild * Thu Jul 16 2020 Merlin Mathesius - 0.6.75-4 - Cleanup conditionals for using python3 * Fri May 29 2020 Jonathan Wakely - 0.6.75-3 - Rebuilt for Boost 1.73 * Tue May 26 2020 Miro Hrončok - 0.6.75-2 - Rebuilt for Python 3.9 * Sun Mar 22 2020 Carl Byington 0.6.75-1 - Markus Schnalke - fix from Debian for vcard version format. * Wed Jan 29 2020 Fedora Release Engineering - 0.6.74-2 - Rebuilt for https://fedoraproject.org/wiki/Fedora_32_Mass_Rebuild * Sun Jan 12 2020 Carl Byington 0.6.74-1 - Paul Wise - many changes for debian: - Add missing linking with zlib and libpthread/librt - Use PKG_CHECK_MODULES to find the gsf-1 library - Fix usage of indefinite articles - Fix a number of spelling mistakes - Use plain make when building from Mercurial - Add operator and quotes to the AX_PYTHON_DEVEL parameter - Remove files copied in by autotools - Add AM_GNU_GETTEXT macros - Rename configure.in to configure.ac - add extern "C" to header for use with C++ code * Mon Aug 19 2019 Miro Hrončok - 0.6.72-6 - Rebuilt for Python 3.8 * Thu Jul 25 2019 Fedora Release Engineering - 0.6.72-5 - Rebuilt for https://fedoraproject.org/wiki/Fedora_31_Mass_Rebuild * Thu Jul 25 2019 Carl Byington 0.6.73-1 - Tim Dufrane - fix segfault in pst_close() * Sat Jun 08 2019 Leigh Scott - 0.6.72-4 - Add configure option for boost-python - Remove all old fedora conditionals - Update spec file to comply with packaging guidelines * Fri Feb 
01 2019 Fedora Release Engineering - 0.6.72-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_30_Mass_Rebuild * Wed Jan 30 2019 Jonathan Wakely - 0.6.72-2 - Rebuilt for Boost 1.69 * Wed Aug 01 2018 Carl Byington 0.6.72-1 - allow all 7 days in bydays recurring appointment - update for Fedora Python packaging - Alfredo Esteban - add -l and -f options to lspst * Fri Jul 13 2018 Fedora Release Engineering - 0.6.71-8 - Rebuilt for https://fedoraproject.org/wiki/Fedora_29_Mass_Rebuild * Wed Feb 07 2018 Fedora Release Engineering - 0.6.71-7 - Rebuilt for https://fedoraproject.org/wiki/Fedora_28_Mass_Rebuild * Sun Aug 20 2017 Zbigniew Jędrzejewski-Szmek - 0.6.71-6 - Add Provides for the old name without %%_isa * Sat Aug 19 2017 Zbigniew Jędrzejewski-Szmek - 0.6.71-5 - Python 2 binary package renamed to python2-libpst See https://fedoraproject.org/wiki/FinalizingFedoraSwitchtoPython3 * Sat Aug 19 2017 Zbigniew Jędrzejewski-Szmek - 0.6.71-4 - Python 2 binary package renamed to python2-libpst See https://fedoraproject.org/wiki/FinalizingFedoraSwitchtoPython3 * Thu Aug 03 2017 Fedora Release Engineering - 0.6.71-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Binutils_Mass_Rebuild * Wed Jul 26 2017 Fedora Release Engineering - 0.6.71-2 - Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Mass_Rebuild * Fri Jul 21 2017 Carl Byington 0.6.71-1 - Fedora Python naming scheme changes - Zachary Travis - Add support for the OST 2013 format, and Content-Disposition filename key fix for outlook compatibility * Thu Jul 20 2017 Kalev Lember - 0.6.70-3 - Rebuilt for Boost 1.64 * Fri Jul 07 2017 Igor Gnatenko - 0.6.70-2 - Rebuild due to bug in RPM (RHBZ #1468476) * Wed Feb 08 2017 Carl Byington 0.6.70-1 - Jeffrey Morlan - pst_getID2 must not recurse into children * Fri Jan 27 2017 Jonathan Wakely - 0.6.69-2 - Rebuilt for Boost 1.63 * Sat Oct 29 2016 Carl Byington 0.6.69-1 - fix bugs in code allowing folders containing multiple item types * Mon Aug 29 2016 Carl Byington 
0.6.68-1 - allow folders containing multiple item types, e.g. email and calendar - better detection of valid internet headers * Tue Jul 19 2016 Fedora Release Engineering - 0.6.67-2 - https://fedoraproject.org/wiki/Changes/Automatic_Provides_for_Python_RPM_Packages * Wed Jul 06 2016 Carl Byington 0.6.67-1 - Jeffrey Morlan - multiple bug fixes and an optimization * Thu Feb 04 2016 Fedora Release Engineering - 0.6.66-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_24_Mass_Rebuild * Fri Jan 15 2016 Jonathan Wakely - 0.6.66-2 - Rebuilt for Boost 1.60 * Mon Dec 21 2015 Carl Byington 0.6.66-1 - Igor Stroh - Added Content-ID header support * Fri Sep 11 2015 Carl Byington 0.6.65-1 - Jeffrey Morlan - fix multiple Content-Type headers - Hans Liss - debug level output * Thu Aug 27 2015 Jonathan Wakely - 0.6.64-6 - Rebuilt for Boost 1.59 * Wed Jul 29 2015 Fedora Release Engineering - 0.6.64-5 - Rebuilt for https://fedoraproject.org/wiki/Changes/F23Boost159 * Wed Jul 22 2015 David Tardon - 0.6.64-4 - rebuild for Boost 1.58 * Wed Jun 17 2015 Fedora Release Engineering - 0.6.64-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_23_Mass_Rebuild * Sat May 02 2015 Kalev Lember - 0.6.64-2 - Rebuilt for GCC 5 C++11 ABI change * Mon Mar 09 2015 Carl Byington 0.6.64-1 - fix line wrap on Python provides_exclude_from - fix unchecked errors found by cppcheck - AJ Shankar fixes for attachment processing and body encodings that contain embedded null chars. 
* Mon Jan 26 2015 Petr Machata - 0.6.63-5 - Rebuild for boost 1.57.0 * Sun Aug 17 2014 Fedora Release Engineering - 0.6.63-4 - Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild * Sat Jun 07 2014 Fedora Release Engineering - 0.6.63-3 - Rebuilt for https://fedoraproject.org/wiki/Fedora_21_Mass_Rebuild * Fri May 23 2014 David Tardon - 0.6.63-2 - rebuild for boost 1.55.0 * Fri Dec 27 2013 Carl Byington 0.6.63-1 - Daniel Gryniewicz found buffer overrun in LIST_COPY_TIME * Sun Sep 22 2013 Carl Byington 0.6.62-1 - 983596 - Old dependency filter breaks file coloring * Tue Aug 06 2013 Carl Byington 0.6.61-1 - move documentation to unversioned directory * Sat Aug 03 2013 Fedora Release Engineering - 0.6.59-4 - Rebuilt for https://fedoraproject.org/wiki/Fedora_20_Mass_Rebuild * Sat Jul 27 2013 pmachata@redhat.com - 0.6.59-3 - Rebuild for boost 1.54.0 * Wed Jun 12 2013 Carl Byington 0.6.60-1 - patch from Dominique Leuenberger to add AC_USE_SYSTEM_EXTENSIONS - add readpst -a option for attachment stripping * Tue Jun 11 2013 Remi Collet - 0.6.59-2 - rebuild for new GD 2.1.0 * Fri May 17 2013 Carl Byington 0.6.59-1 - add autoconf checking for libgsf * Fri Mar 29 2013 Carl Byington 0.6.58-4 - add autoreconf for aarch64 * Sun Feb 10 2013 Denis Arnaud - 0.6.58-3 - Rebuild for Boost-1.53.0 * Sat Feb 09 2013 Denis Arnaud - 0.6.58-2 - Rebuild for Boost-1.53.0 * Fri Dec 28 2012 Carl Byington - 0.6.58-1 - fix From quoting on embedded rfc/822 messages * Wed Dec 26 2012 Carl Byington - 0.6.57-1 - bugzilla 852414, remove unnecessary dependencies * Mon Dec 24 2012 Carl Byington - 0.6.56-1 - filter private provides from rpm - merge -m .msg files code into main branch * Thu Aug 09 2012 Carl Byington - 0.6.55-2 - rebuild for Python * Thu Jul 19 2012 Fedora Release Engineering - 0.6.54-6 - Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild * Tue May 08 2012 Carl Byington - 0.6.55-1 - preserve bcc headers - document -C switch to set default character set - space 
after colon is not required in header fields * Tue Feb 28 2012 Fedora Release Engineering - 0.6.54-5 - Rebuilt for c++ ABI breakage * Fri Jan 13 2012 Fedora Release Engineering - 0.6.54-4 - Rebuilt for https://fedoraproject.org/wiki/Fedora_17_Mass_Rebuild * Sat Dec 24 2011 Carl Byington - 0.6.54-3 - bump versions and prep for Fedora build * Wed Nov 30 2011 Petr Pisar - 0.6.53-3 - Rebuild against boost-1.48 * Mon Nov 14 2011 Carl Byington - 0.6.54-2 - failed to bump version number * Fri Nov 04 2011 Carl Byington - 0.6.54-1 - embedded rfc822 messages might contain rtf encoded bodies * Fri Sep 02 2011 Petr Pisar - 0.6.53-2 - Rebuild against boost-1.47 * Sun Jul 10 2011 Carl Byington - 0.6.53-1 - add Status: header in output - allow fork for parallel processing of individual email folders in separate mode - proper handling of --with-boost-python option * Sun May 22 2011 Carl Byington - 0.6.52-1 - fix dangling freed pointer in embedded rfc822 message processing - allow broken outlook internet header field - it sometimes contains fragments of the message body rather than headers * Sun Apr 17 2011 Carl Byington - 0.6.51-1 - fix for buffer overrun; attachment size from the secondary list of mapi elements overwrote proper size from the primary list of mapi elements. Fedora bugzilla 696263 * Tue Feb 08 2011 Fedora Release Engineering - 0.6.49-4 - Rebuilt for https://fedoraproject.org/wiki/Fedora_15_Mass_Rebuild * Mon Feb 07 2011 Thomas Spura - 0.6.49-3 - rebuild for new boost * Fri Dec 24 2010 Carl Byington - 0.6.50-1 - rfc2047 and rfc2231 encoding for non-ascii headers and attachment filenames. 
* Wed Sep 29 2010 jkeating - 0.6.49-2 - Rebuilt for gcc bug 634757 * Mon Sep 13 2010 Carl Byington - 0.6.49-1 - fix to ignore embedded objects that are not email messages Fedora bugzilla 633498 * Thu Sep 02 2010 Carl Byington - 0.6.48-1 - fix for broken internet headers from Outlook - fix ax_python.m4 to look for python2.7 - use mboxrd from quoting for output formats with multiple messages per file - use no from quoting for output formats with single message per file * Sat Jul 31 2010 Carl Byington - 0.6.47-6 - rebuild for Python dependencies * Mon Jul 26 2010 David Malcolm - 0.6.47-4 - hack up configure so that it looks for Python 2.7 * Wed Jul 21 2010 David Malcolm - 0.6.47-3 - Rebuilt for https://fedoraproject.org/wiki/Features/Python_2.7/MassRebuild * Wed Jul 07 2010 Carl Byington - 0.6.47-2 - Subpackage Licensing, add COPYING to -libs. - patches from Kenneth Berland for solaris * Fri May 07 2010 Carl Byington - 0.6.47-1 - patches from Kenneth Berland for solaris * Thu Jan 21 2010 Carl Byington - 0.6.46-1 - prefer libpthread over librt for finding sem_init function. * Thu Jan 21 2010 Carl Byington - 0.6.45-2 - rebuild for new boost package * Wed Nov 18 2009 Carl Byington - 0.6.45-1 - patch from Hugo DesRosiers to export categories and notes into vcards. - extend that patch to export categories into vcalendar appointments also. * Sun Sep 20 2009 Carl Byington - 0.6.44-1 - patch from Lee Ayres to add file name extensions in separate mode. - allow mixed items types in a folder in separate mode. * Sat Sep 12 2009 Carl Byington - 0.6.43-1 - decode more of the pst format, some minor bug fixes - add support for code pages 1200 and 1201. - add readpst -t option to select output item types, which can now be used to process folders containing mixed item types. 
- fix segfault with embedded appointments - add readpst -u option for Thunderbird mode .size and .type files - better detection of embedded rfc822 message attachments * Thu Sep 03 2009 Carl Byington - 0.6.42-1 - patch from Fridrich Strba to build with DJGPP DOS cross-compiler. * Sat Jul 25 2009 Fedora Release Engineering - 0.6.41-2 - Rebuilt for https://fedoraproject.org/wiki/Fedora_12_Mass_Rebuild * Tue Jun 23 2009 Carl Byington - 0.6.41-1 - fix ax_python detection - should not use locate command - checking for Fedora versions is not needed * Tue Jun 23 2009 Carl Byington - 0.6.40-1 - Fedora 11 has python2.6 - remove pdf version of the man pages * Sun Jun 21 2009 Carl Byington - 0.6.39-1 - Fedora > 10 moved to boost-python-devel * Sun Jun 21 2009 Carl Byington - 0.6.38-1 - add Python interface to the shared library. - bump soname to version 4 for many changes to the interface. - better decoding of recurrence data in appointments. - remove readpstlog since debug log files are now plain text. - add readpst -j option for parallel jobs for each folder. - make nested mime multipart/alternative to hold the text/html parts. * Fri Apr 17 2009 Carl Byington - 0.6.37-1 - add pst_attach_to_mem() back into the shared library interface. - fix memory leak caught by valgrind. * Tue Apr 14 2009 Carl Byington - 0.6.36-1 - build separate -doc and -devel-doc subpackages. - other spec file cleanup * Wed Apr 08 2009 Carl Byington - 0.6.35-1 - properly add trailing mime boundary in all modes. - build separate libpst, libpst-libs, libpst-devel rpms. * Thu Mar 19 2009 Carl Byington - 0.6.34-1 - avoid putting mixed item types into the same output folder. * Tue Mar 17 2009 Carl Byington - 0.6.33-1 - compensate for iconv conversion to utf-7 that produces strings that are not null terminated. - don't produce empty attachment files in separate mode. 
* Sat Mar 14 2009 Carl Byington - 0.6.32-1 - fix ppc64 compile error * Sat Mar 14 2009 Carl Byington - 0.6.31-1 - bump version for Fedora cvs tagging mistake * Sat Mar 14 2009 Carl Byington - 0.6.30-1 - track character set individually for each mapi element. - remove charset option from pst2ldif since we get that from each object now. - avoid emitting bogus empty email messages into contacts and calendar files. * Tue Feb 24 2009 Carl Byington - 0.6.29-1 - fix for 64bit on Fedora 11 * Tue Feb 24 2009 Carl Byington - 0.6.28-1 - improve decoding of multipart/report and message/rfc822 mime types. - improve character set handling. - fix embedded rfc822 messages with attachments. * Sat Feb 07 2009 Carl Byington - 0.6.27-1 - fix for const correctness on Fedora 11 * Sat Feb 07 2009 Carl Byington - 0.6.26-1 - patch from Fridrich Strba for building on mingw and general - cleanup of autoconf files. - add processing for pst files of type 0x0f. - strip and regenerate all MIME headers to avoid duplicates. - do a better job of making unique MIME boundaries. - only use base64 coding when strictly necessary. * Fri Jan 16 2009 Carl Byington - 0.6.25-1 - improve handling of content-type charset values in mime parts * Thu Dec 11 2008 Carl Byington - 0.6.24-1 - patch from Chris Eagle to build on cygwin * Thu Dec 04 2008 Carl Byington - 0.6.23-1 - bump version to avoid cvs tagging mistake in fedora * Fri Nov 28 2008 Carl Byington - 0.6.22-1 - patch from David Cuadrado to process emails with type PST_TYPE_OTHER - base64_encode_multiple() may insert newline, needs larger malloc - subject lines shorter than 2 bytes could segfault * Tue Oct 21 2008 Carl Byington - 0.6.21-1 - fix title bug with old schema in pst2ldif. - also escape commas in distinguished names per rfc4514. * Thu Oct 09 2008 Carl Byington - 0.6.20-1 - add configure option --enable-dii=no to remove dependency on libgd. - many fixes in pst2ldif by Robert Harris. 
- add -D option to include deleted items, from Justin Greer - fix from Justin Greer to add missing email headers - fix from Justin Greer for my_stristr() - fix for orphan children when building descriptor tree - avoid writing uninitialized data to debug log file - remove unreachable code - create dummy top-of-folder descriptor if needed for corrupt pst files * Sun Sep 14 2008 Carl Byington - 0.6.19-1 - Fix base64 encoding that could create long lines. - Initial work on a .so shared library from Bharath Acharya. * Thu Aug 28 2008 Carl Byington - 0.6.18-1 - Fixes for iconv on Mac from Justin Greer. * Tue Aug 05 2008 Carl Byington - 0.6.17-1 - More fixes for 32/64 bit portability on big endian ppc. * Tue Aug 05 2008 Carl Byington - 0.6.16-1 - Use inttypes.h for portable printing of 64 bit items. * Wed Jul 30 2008 Carl Byington - 0.6.15-1 - Patch from Robert Simpson for file handle leak in error case. - Fix for missing length on lz decompression, bug found by Chris White. * Sun Jun 15 2008 Carl Byington - 0.6.14-1 - Fix my mistake in Debian packaging. * Fri Jun 13 2008 Carl Byington - 0.6.13-1 - Patch from Robert Simpson for encryption type 2. * Tue Jun 10 2008 Carl Byington - 0.6.12-1 - Patch from Joachim Metz for Debian packaging and - fix for incorrect length on lz decompression * Tue Jun 03 2008 Carl Byington - 0.6.11-1 - Use ftello/fseeko to properly handle large files. - Document and properly use datasize field in b5 blocks. - Fix some MSVC compile issues and collect MSVC dependencies into one place. * Thu May 29 2008 Carl Byington - 0.6.10-1 - Patch from Robert Simpson for doubly-linked list code and arrays of unicode strings. * Fri May 16 2008 Carl Byington - 0.6.9 - Patch from Joachim Metz for 64 bit compile. - Fix pst format documentation for 8 byte backpointers. 
 * Wed Mar 05 2008 Carl Byington - 0.6.8
 - Initial version of pst2dii to convert to Summation dii load file format
 - changes for Fedora packaging guidelines (#434727)
 * Tue Jul 10 2007 Carl Byington - 0.5.5
 - merge changes from Joe Nahmias version
 * Sun Feb 19 2006 Carl Byington - 0.5.3
 - initial spec file using autoconf and http://www.fedora.us/docs/rpm-packaging-guidelines.html
diff --git a/xml/libpst.in b/xml/libpst.in
index acdf08d..42248f0 100644
--- a/xml/libpst.in
+++ b/xml/libpst.in
@@ -1,2055 +1,2055 @@
 @PACKAGE@ Utilities - Version @VERSION@
 Packages
 The various source and binary packages are available at http://www.five-ten-sg.com/@PACKAGE@/packages/.
 The most recent documentation is available at http://www.five-ten-sg.com/@PACKAGE@/.
 The most recent developer documentation for the shared library is available at http://www.five-ten-sg.com/@PACKAGE@/devel/.
 A Git source code repository for this project is available at
-http://hg.five-ten-sg.com/@PACKAGE@/.
+https://github.com/pst-format/@PACKAGE@.git.
 This version can now convert both 32 bit Outlook files (pre 2003), and the 64 bit Outlook 2003 pst files. Utilities are supplied to convert email messages to both mbox and MH mailbox formats, and to DII load file format for use with many of the CT Summation products. Contacts can be converted to a simple list, to vcard format, or to ldif format for import to an LDAP server.
 The libpff project has some excellent documentation of the pst file format.
 2017-12-07
 readpst 1 readpst @VERSION@
 readpst
 convert PST (MS Outlook Personal Folders) files to mbox and other formats
 Synopsis
 readpst pstfile
 Description
 readpst is a program that can read an Outlook PST (Personal Folders) file and convert it into an mbox file, a format suitable for KMail, a recursive mbox structure, or separate emails.
 Options
 -C default-charset
 Set the character set to be used for items with an unspecified character set.
-D Include deleted items in the output. -M Output messages in MH (rfc822) format as separate files. This will create folders as named in the PST file, and will put each email together with any attachments into its own file. These files will be numbered from 1 to n with no leading zeros. This format has no from quoting. -S Output messages into separate files. This will create folders as named in the PST file, and will put each email in its own file. These files will be numbered from 1 to n with no leading zeros. Attachments will also be saved in the same folder as the email message. The attachments for message $m are saved as $m-$name where $name is (the original name of the attachment, or 'attach$n' if the attachment had no name), where $n is another sequential index with no leading zeros. This format has no from quoting. -V Show program version and exit. -a attachment-extension-list Set the list of acceptable attachment extensions. Any attachment that does not have an extension on this list will be discarded. All attachments are acceptable if the list is empty, or this option is not specified. -b Do not save the attachments for the RTF format of the email body. -c format Set the Contact output mode. Use -cv for vcard format or -cl for an email list. -d debug-file Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. -e Same as the M option, but each output file will include an extension from (.eml, .ics, .vcf). This format has no from quoting. -h Show summary of options and exit. -j jobs Specifies the maximum number of parallel jobs. Specify 0 to suppress running parallel jobs. Folders may be processed in parallel. Output formats that place each mail message in a separate file (-M, -S, -e) may process the contents of individual folders in parallel. -k Changes the output format to KMail. This format uses mboxrd from quoting. 
-m Same as the e option, but write .msg files also -o output-directory Specifies the output directory. The directory must already exist, and is entered after the PST file is opened, but before any processing of files commences. -q Changes to silent mode. No feedback is printed to the screen, except for error messages. -r Changes the output format to Recursive. This will create folders as named in the PST file, and will put all emails in a file called "mbox" inside each folder. Appointments go into a file called "calendar", address book entries go into a file called "contacts", and journal entries go into a file called "journal". These files are then compatible with all mbox-compatible email clients. This format uses mboxrd from quoting. -t output-type-codes Specifies the item types that are processed. The argument is a sequence of single letters from (e,a,j,c) for (email, appointment, journal, contact) types. The default is to process all item types. -u Sets Thunderbird mode, a submode of recursive mode. This causes two extra .type and .size meta files to be created. This format uses mboxrd from quoting. -w Overwrite any previous output files. Beware: When used with the -S switch, this will remove all files from the target folder before writing. This is to keep the count of emails and attachments correct. -8 Output bodies in UTF-8, rather than original encoding, if a UTF-8 version is available. From Quoting Output formats that place each mail message in a separate file (-M, -S, -e, -m) don't do any from quoting. Output formats that place multiple email messages in a single file (-k, -r, -u) now use mboxrd from quoting rules. If none of those switches are specified, the default output format uses mboxrd from quoting rules, since it produces multiple email messages in a single file. Earlier versions used mboxo from quoting rules for all output formats. 
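The difference between the two from quoting rules named above can be sketched in a few lines of Python. This is an illustration only, not code taken from readpst: the function names are hypothetical, and the rules shown are the commonly described mbox conventions (mboxrd quotes any body line consisting of zero or more `>` followed by `From `, while mboxo quotes only lines that begin with `From `).

```python
import re

# Body lines an mboxrd writer quotes: an optional run of '>' and then "From ".
_MBOXRD_LINE = re.compile(r"^>*From ")


def quote_mboxrd(body: str) -> str:
    """Prefix '>' to every line matching ^>*From (reversible quoting)."""
    return "\n".join(
        ">" + line if _MBOXRD_LINE.match(line) else line
        for line in body.split("\n")
    )


def quote_mboxo(body: str) -> str:
    """Prefix '>' only to lines starting with 'From ' (older, lossy rule)."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n")
    )


def unquote_mboxrd(body: str) -> str:
    """Undo quote_mboxrd by removing one '>' from lines matching ^>+From ."""
    return "\n".join(
        line[1:] if re.match(r"^>+From ", line) else line
        for line in body.split("\n")
    )
```

Because mboxrd also quotes lines that already start with `>From `, a reader can strip exactly one `>` and recover the original body. Under mboxo an original `>From ` line cannot be told apart from a quoted `From ` line.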
Author This manual page was originally written by Dave Smith <dave.s@earthcorp.com>, and updated by Joe Nahmias <joe@nahmias.net> for the Debian GNU/Linux system (but may be used by others). It was subsequently updated by Brad Hards <bradh@frogmouth.net>, and converted to xml format by Carl Byington <carl@five-ten-sg.com>. Copyright Copyright (C) 2002 by David Smith <dave.s@earthcorp.com>. XML version Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com>. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. Version @VERSION@ 2016-08-29 lspst 1 lspst @VERSION@ lspst list PST (MS Outlook Personal Folders) file data Synopsis lspst pstfile Options -V Show program version and exit. -d debug-file Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. -f date-format Select the date format for long format listing. Defaults to "%F %T". -l Use long format listing to show the Date, CC and BCC headers. -h Show summary of options and exit. Description lspst is a program that can read an Outlook PST (Personal Folders) file and produce a simple listing of the data (contacts, email subjects, etc). Author lspst was written by Joe Nahmias <joe@nahmias.net> based on readpst. This man page was written by 510 Software Group <carl@five-ten-sg.com>. Copyright Copyright (C) 2004 by Joe Nahmias <joe@nahmias.net>. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. 
You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. Version @VERSION@ 2017-12-07 pst2ldif 1 pst2ldif @VERSION@ pst2ldif extract contacts from an MS Outlook .pst file in .ldif format Synopsis pst2ldif pstfilename Options -V Show program version. Subsequent options are then ignored. -b ldap-base Sets the ldap base value used in the dn records. You probably want to use something like "o=organization, c=US". -c class Sets the objectClass values for the contact items. This class needs to be defined in the schema used by your LDAP server, and at a minimum it must contain the ldap attributes given below. This option may be specified multiple times to generate entries with multiple object classes. -d debug-file Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. -l extra-line Specify an extra line to be added to each ldap entry. This option may be specified multiple times to add multiple lines to each ldap entry. -o Use the old ldap schema, rather than the default new ldap schema. The old schema generates multiple postalAddress attributes for a single entry. The new schema generates a single postalAddress (and homePostalAddress when available) attribute with $ delimiters as specified in RFC4517. Using the old schema also generates two extra leading entries, one for "dn:ldap base", and one for "dn: cn=root, ldap base". -h Show summary of options. Subsequent options are then ignored. Description pst2ldif reads the contact information from an MS Outlook .pst file and produces a .ldif file that may be used to import those contacts into an LDAP database. 
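The `$` delimited postalAddress value mentioned under the -o option follows the Postal Address syntax of RFC 4517. A minimal Python sketch of that encoding is shown here. The function name is hypothetical, and whether pst2ldif applies the same escaping is an assumption, since this page does not say. RFC 4517 joins the address lines with `$` and escapes a literal `\` as `\5C` and a literal `$` as `\24`.

```python
def encode_postal_address(lines):
    """Join address lines with '$', escaping '\\' and '$' inside each line.

    Hypothetical helper illustrating the RFC 4517 Postal Address syntax;
    it does not handle LDIF-level concerns such as base64 encoding.
    """
    escaped = []
    for line in lines:
        # Escape the backslash first so the backslash added for '$' survives.
        line = line.replace("\\", "\\5C").replace("$", "\\24")
        escaped.append(line)
    return "$".join(escaped)
```

For example, the three lines `1234 Main St.`, `Anytown, CA 12345` and `USA` become the single value `1234 Main St.$Anytown, CA 12345$USA`.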
The following ldap attributes are generated for the old ldap schema: cn givenName sn personalTitle company mail postalAddress l st postalCode c homePhone telephoneNumber facsimileTelephoneNumber mobile description The following attributes are generated for the new ldap schema: cn givenName sn title o mail postalAddress homePostalAddress l st postalCode c homePhone telephoneNumber facsimileTelephoneNumber mobile description labeledURI Copyright Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. Version @VERSION@ 2017-12-07 pst2dii 1 pst2dii @VERSION@ pst2dii extract email messages from an MS Outlook .pst file in DII load format Synopsis pst2dii -f ttf-font-file pstfilename Options -B bates-prefix Sets the bates prefix string. The bates sequence number is appended to this string, and printed on each page. -O dii-output-file Name of the output DII load file. -V Show program version. Subsequent options are then ignored. -b bates-number Starting bates sequence number. The default is zero. -c bates-color Font color for the bates stamp on each page, specified as 6 hex digits as rrggbb values. The default is ff0000 for bright red. -d debug-file Specify name of debug log file. The log file is now an ascii file, instead of the binary file used in previous versions. -f ttf-font-file Specify name of a true type font file. This should be a fixed pitch font. -h Show summary of options. Subsequent options are then ignored. -o output-directory Specifies the output directory. The directory must already exist. 
Description

pst2dii reads the email messages from an MS Outlook .pst file and produces a DII load file that may be used to import message summaries into a Summation DII system. The DII output file contains references to the image and attachment files in the output directory.

Copyright

Copyright (C) 2008 by 510 Software Group <carl@five-ten-sg.com>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.

You should have received a copy of the GNU General Public License along with this program; see the file COPYING. If not, please write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.

Version

@VERSION@

2017-12-07 outlook.pst 5

outlook.pst - format of MS Outlook .pst file

Synopsis

outlook.pst

Overview

Low level or primitive items in a .pst file are identified by an I_ID value. Higher level or composite items in a .pst file are identified by a D_ID value. There are two separate b-trees indexed by these I_ID and D_ID values.

Starting with Outlook 2003, the file format changed from one with 32 bit pointers, to one with 64 bit pointers. We describe both formats here.

32 bit File Header

The 32 bit file header is located at offset 0 in the .pst file.

We only support index types 0x0e, 0x0f, 0x15, and 0x17, and encryption types 0x00, 0x01 and 0x02. Index type 0x0e is the older 32 bit Outlook format. Index type 0x0f seems to be rare, and so far the data seems to be identical to that in type 0x0e files. Index type 0x17 is the newer 64 bit Outlook format. Index type 0x15 seems to be rare, and according to the libpff project should have the same format as type 0x17 files. It was found in a 64-bit pst file created by Visual Recovery.
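The supported index types listed above can be summarized in a small lookup. The following Python sketch is illustrative only: the names `pointer_width`, `INDEX_TYPES_32` and `INDEX_TYPES_64` are hypothetical, and the index type is assumed to have been read from the header already, since the header layout is not reproduced here.

```python
# Index types named in the text, grouped by pointer width.
INDEX_TYPES_32 = {0x0E, 0x0F}
INDEX_TYPES_64 = {0x15, 0x17}

# Encryption types named in the text as supported.
SUPPORTED_ENCRYPTION = {0x00, 0x01, 0x02}


def pointer_width(index_type):
    """Return 32 or 64 for a supported index type (hypothetical helper)."""
    if index_type in INDEX_TYPES_32:
        return 32
    if index_type in INDEX_TYPES_64:
        return 64
    raise ValueError(f"unsupported index type 0x{index_type:02x}")


print(pointer_width(0x0E), pointer_width(0x17))
```

Rejecting unknown values, rather than guessing a width from the high order bits, keeps the sketch limited to the four types the text says are supported.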
It may be that index types less than 0x10 are 32 bit, and index types greater than or equal to 0x10 are 64 bit, and the low order four bits of the index type are some subtype or minor version number.

Encryption type 0x00 is no encryption, type 0x01 is "compressible" encryption which is a simple substitution cipher, and type 0x02 is "strong" encryption, which is a simple three rotor Enigma cipher from WWII.

offsetIndex1 is the file offset of the root of the index1 b-tree, which contains (I_ID, offset, size, unknown) tuples for each item in the file. backPointer1 is the value that should appear in the parent pointer of that root node.

offsetIndex2 is the file offset of the root of the index2 b-tree, which contains (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID) tuples for each item in the file. backPointer2 is the value that should appear in the parent pointer of that root node.

64 bit File Header

The 64 bit file header is located at offset 0 in the .pst file.

32 bit Index 1 Node

The 32 bit index1 b-tree nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 12 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a triple of (I_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and I_ID is the lowest I_ID value in the subtree.

64 bit Index 1 Node

The 64 bit index1 b-tree nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 24 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node.
Each item in this node is a triple of (I_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and I_ID is the lowest I_ID value in the subtree.

32 bit Index 1 Leaf Node

The 32 bit index1 b-tree leaf nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 12 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a tuple of (I_ID, offset, size, unknown). The two low order bits of the I_ID value seem to be flags. I have never seen a case with bit zero set. Bit one indicates that the item is not encrypted. Note that references to these I_ID values elsewhere may have the low order bit set (and I don't know what that means), but when we do the search in this tree we need to clear that bit so that we can find the correct item.

64 bit Index 1 Leaf Node

The 64 bit index1 b-tree leaf nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 24 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a tuple of (I_ID, offset, size, unknown). The two low order bits of the I_ID value seem to be flags. I have never seen a case with bit zero set. Bit one indicates that the item is not encrypted. Note that references to these I_ID values elsewhere may have the low order bit set (and I don't know what that means), but when we do the search in this tree we need to clear that bit so that we can find the correct item.

32 bit Index 2 Node

The 32 bit index2 b-tree nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 12 byte records that are active. The nodeLevel is non-zero for this style of nodes.
The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a triple of (D_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and D_ID is the lowest D_ID value in the subtree.

64 bit Index 2 Node

The 64 bit index2 b-tree nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 24 byte records that are active. The nodeLevel is non-zero for this style of nodes. The leaf nodes have a different format. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a triple of (D_ID, backPointer, offset) where the offset points to the next deeper node in the tree, the backPointer value must match the backPointer in that deeper node, and D_ID is the lowest D_ID value in the subtree.

32 bit Index 2 Leaf Node

The 32 bit index2 b-tree leaf nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 16 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node.

Each item in this node is a tuple of (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID). The DESC-I_ID points to the main data for this item (Associated Descriptor Items 0x7cec, 0xbcec, or 0x0101) via the index1 tree. The TREE-I_ID is zero or points to an Associated Tree Item 0x0002 via the index1 tree. The PARENT-D_ID points to the parent of this item in this index2 tree.

64 bit Index 2 Leaf Node

The 64 bit index2 b-tree leaf nodes are 512 byte blocks with the following format.

The itemCount specifies the number of 32 byte records that are active. The nodeLevel is zero for these leaf nodes. The backPointer must match the backPointer from the triple that pointed to this node.
Each item in this node is a tuple of (D_ID, DESC-I_ID, TREE-I_ID, PARENT-D_ID). The DESC-I_ID points to the main data for this item (Associated Descriptor Items 0x7cec, 0xbcec, or 0x0101) via the index1 tree. The TREE-I_ID is zero or points to an Associated Tree Item 0x0002 via the index1 tree. The PARENT-D_ID points to the parent of this item in this index2 tree.

32 bit Associated Tree Item 0x0002

A D_ID value may point to an entry in the index2 tree with a non-zero TREE-I_ID which points to this descriptor block via the index1 tree. It maps local ID2 values (referenced in the main data for the original D_ID item) to I_ID values. This descriptor block contains triples of (ID2, I_ID, CHILD-I_ID) where the local ID2 data can be found via I_ID, and CHILD-I_ID is either zero or it points to another Associated Tree Item via the index1 tree.

In the above 32 bit leaf node, we have a tuple of (0x61, 0x02a82c, 0x02a836, 0). Here 0x02a836 is the I_ID of the associated tree, and we can look up that I_ID value in the index1 b-tree to find the (offset,size) of the data in the .pst file.

64 bit Associated Tree Item 0x0002

This descriptor block contains a tree that maps local ID2 values to I_ID entries, similar to the 32 bit version described above.

Associated Descriptor Item 0xbcec

Contains information about the item, which may be email, contact, or other Outlook types. In the above leaf node, we have a tuple of (0x21, 0x00e638, 0, 0). Here 0x00e638 is the I_ID of the associated descriptor, and we can look up that I_ID value in the index1 b-tree to find the (offset,size) of the data in the .pst file.

This descriptor is eventually decoded to a list of MAPI elements.

Note the signature of 0xbcec. There are other descriptor block formats with other signatures. Note the indexOffset of 0x013c - starting at that position in the descriptor block, we have an array of two byte integers. The first integer (0x000b) is a (count-1) of the number of overlapping pairs following the count.
The first pair is (0, 0xc), the next pair is (0xc, 0x14) and the last (12th) pair is (0x123, 0x13b). These pairs are (start,end+1) offsets of items in this block. So we have count+2 integers following the count value.

Note the b5offset of 0x0020, which is a type that I will call an index reference. Such index references have at least two different forms, and may point to data either in this block, or in some other block. External pointer references have the low order 4 bits all set, and are ID2 values that can be used to fetch data. This value of 0x0020 is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0002, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xc, 0x14) pair.

So far we have only described internal index references where the high order 16 bits are zero. That suffices for single descriptor blocks. But in the case of the type 0x0101 descriptor block, we have an array of subblocks. In this case, the high order 16 bits of an internal index reference are used to select the subblock. Each subblock starts with a 16 bit indexOffset which points to the count and array of 16 bit integer pairs which are offsets in the current subblock.

Finally, we have the offset and size of the "b5" block located at offset 0xc with a size of 8 bytes in this descriptor block. The "b5" block has the following format:

Note the descoffset of 0x0040, which again is an index reference. In this case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0004, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0x14, 0x7c) pair. The datasize (6) plus the b5 code (02) gives the size of the entries, in this case 8 bytes. We now have the offset 0x14 of the descriptor array, composed of 8 byte entries that describe MAPI elements.
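The index reference arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the libpst implementation: the function names are hypothetical, the block is synthetic (three items instead of the twelve in the text), and the two byte integers are assumed to be little-endian.

```python
import struct


def is_external(ref):
    """External references have the low order 4 bits all set."""
    return (ref & 0xF) == 0xF


def resolve_internal(block, index_offset, ref):
    """Return the (start, end+1) pair selected by an internal index reference."""
    if is_external(ref):
        raise ValueError("external reference: an ID2 value, not an offset")
    byte_offset = (ref & 0xFFFF) >> 4            # 0x0020 becomes 0x0002
    position = index_offset + 2 + byte_offset    # plus two skips the count
    # Assumption: 16 bit values are stored little-endian.
    return struct.unpack_from("<HH", block, position)


# Synthetic block: 0x1c bytes of item data, then the index table.
# The stored count is (pairs - 1) = 2, followed by count+2 = 4 integers.
INDEX_OFFSET = 0x1C
block = bytes(INDEX_OFFSET) + struct.pack("<5H", 2, 0x00, 0x0C, 0x14, 0x1C)

start, end = resolve_internal(block, INDEX_OFFSET, 0x0020)
print(hex(start), hex(end))
```

With this synthetic table, the reference 0x0020 selects the (0xc, 0x14) pair, as in the text, and 0x0040 selects the pair after it.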
Each descriptor entry has the following format:

For some reference types (2, 3, 0xb) the value is used directly. Otherwise, the value is an index reference, which is either an ID2 value, or an offset, to be right shifted by 4 bits and used to fetch a pair from the index table to find the offset and size of the item in this descriptor block.

The following reference types are known, but not all of these are implemented in the code yet.

The following item types are known, but not all of these are implemented in the code yet.

Associated Descriptor Item 0x7cec

This style of descriptor block is similar to the 0xbcec format.

This descriptor is also eventually decoded to a list of MAPI elements.

Note the signature of 0x7cec. There are other descriptor block formats with other signatures. Note the indexOffset of 0x017a - starting at that position in the descriptor block, we have an array of two byte integers. The first integer (0x0006) is a (count-1) of the number of overlapping pairs following the count. The first pair is (0, 0xc), the next pair is (0xc, 0x14) and the last (7th) pair is (0x160, 0x179). These pairs are (start,end+1) offsets of items in this block. So we have count+2 integers following the count value.

Note the 7coffset of 0x0040, which is an index reference. In this case, it is an internal reference pointer, which needs to be right shifted by 4 bits to become 0x0004, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0x14, 0xea) pair. We have the offset and size of the "7c" block located at offset 0x14 with a size of 214 bytes in this case. The "7c" block starts with a header with the following format:

Note the b5Offset of 0x0020, which is an index reference.
In this case, it is an internal reference pointer, which needs to be right shifted by 4 bits to become 0x0002, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xc, 0x14) pair.

Finally, we have the offset and size of the "b5" block located at offset 0xc with a size of 8 bytes in this descriptor block. The "b5" block has the following format:

Note the descoffset of 0x0060, which again is an index reference. In this case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0006, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xea, 0xf0) pair. The datasize (2) plus the b5 code (04) gives the size of the entries, in this case 6 bytes. We now have the offset 0xea of an unused block of data in an unknown format, composed of 6 byte entries. That gives us (0xf0 - 0xea)/6 = 1, so we have a recordCount of one.

We have seen cases where the descoffset in the b5 block is zero, and the index2Offset in the 7c block is zero. This has been seen for objects that seem to be attachments on messages that have been read. Before the message was read, it did not have any attachments.

Note the index2Offset above of 0x0080, which again is an index reference. In this case, it is an internal pointer reference, which needs to be right shifted by 4 bits to become 0x0008, which is then a byte offset to be added to the above indexOffset plus two (to skip the count), so it points to the (0xf0, 0x155) pair. This is an array of tables of four byte integers. We will call these the IND2 tables. The size of each of these tables is specified by the recordSize field of the "7c" header. The number of these tables is the above recordCount value derived from the "b5" block.

Now the remaining data in the "7c" block after the header starts at offset 0x2a.
There should be itemCount 8 byte items here, with the following format:

The ind2Offset is a byte offset into the current IND2 table of some value. If that is a four byte integer value, then once we fetch that, we have the same triple (item type, reference type, value) as we find in the 0xbcec style descriptor blocks. If not, then this value is used directly.

These 8 byte descriptors are processed recordCount times, each time using the next IND2 table. The item and reference types are as described above for the 0xbcec format descriptor block.

32 bit Associated Descriptor Item 0x0101

This descriptor block contains a list of I_ID values. It is used when an I_ID (that would normally point to a type 0x7cec or 0xbcec descriptor block) contains more data than can fit in any single descriptor of those types. In this case, it points to a type 0x0101 block, which contains a list of I_ID values that themselves point to the actual descriptor blocks. The total length value in the 0x0101 header is the sum of the lengths of the blocks pointed to by the list of I_ID values. The result is an array of subblocks, that may contain index references where the high order 16 bits specify which descriptor subblock to use. Only the first descriptor subblock contains the signature (0xbcec or 0x7cec).

64 bit Associated Descriptor Item 0x0101

This descriptor block contains a list of I_ID values, similar to the 32 bit version described above.
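To illustrate how the high order 16 bits of an index reference select a subblock, here is a minimal Python sketch under stated assumptions. The helper names and the synthetic subblocks are hypothetical, the integers are assumed to be little-endian, and each subblock is assumed to already be in memory as a bytes object (fetching them through the list of I_ID values is not shown).

```python
import struct


def split_reference(ref):
    """Split an internal index reference into (subblock number, byte offset)."""
    subblock = (ref >> 16) & 0xFFFF      # high order 16 bits pick the subblock
    byte_offset = (ref & 0xFFFF) >> 4    # low 16 bits, right shifted by 4
    return subblock, byte_offset


def resolve_in_subblocks(subblocks, ref):
    """Return the bytes of the item that ref selects."""
    number, byte_offset = split_reference(ref)
    block = subblocks[number]
    # Each subblock starts with its own 16 bit indexOffset.
    (index_offset,) = struct.unpack_from("<H", block, 0)
    # Plus two skips the count stored at indexOffset.
    start, end = struct.unpack_from("<HH", block, index_offset + 2 + byte_offset)
    return block[start:end]


def make_subblock(payload):
    """Build a synthetic subblock: indexOffset, payload, then the index table."""
    index_offset = 2 + len(payload)
    # Stored count is (pairs - 1) = 1; pairs are (0, 2) and (2, index_offset).
    table = struct.pack("<4H", 1, 0, 2, index_offset)
    return struct.pack("<H", index_offset) + payload + table


subblocks = [make_subblock(b"first"), make_subblock(b"second!")]
print(resolve_in_subblocks(subblocks, 0x00010020))
```

The synthetic subblocks omit the signature and other header fields; only the leading indexOffset and the index table are modeled.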