Releases: TeamMsgExtractor/msg-extractor

Version 0.41.4

11 Jun 16:42
9138590

v0.41.4

  • Fixed an issue in the last version that would break the decoding function if the contents were not encoded.
  • Updated tzlocal and allowed future updates for compressed_rtf and ebcdic.

Version 0.41.3

11 Jun 02:27
ac77e62

v0.41.3

  • [TeamMsgExtractor #365] Fixed an issue that would cause certain values retrieved from the header to not be decoded properly. The decoding now happens when the values are retrieved, so nothing about the header itself has been changed.
  • Added new property MessageBase.headerText which is the text content of the header stream. Adjusted other things to use this instead of trying to retrieve the stream directly in multiple places.
  • Added typing to MessageBase.header.
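
As a rough illustration of decoding at retrieval time, the sketch below uses the standard library's email.header helpers to decode a header value while letting unencoded values pass through untouched. It assumes the affected values were RFC 2047 encoded-words, which the notes above do not state, and decode_header_value is a hypothetical helper, not a function from this module.

```python
from email.header import decode_header, make_header


def decode_header_value(value: str) -> str:
    """Decode RFC 2047 encoded-words; plain values pass through unchanged."""
    # decode_header splits the value into (data, charset) pairs, and
    # make_header joins them back into a single unicode string.
    return str(make_header(decode_header(value)))


# An encoded value is decoded, while an unencoded one is returned as-is.
encoded = decode_header_value('=?utf-8?q?Caf=C3=A9?=')
plain = decode_header_value('Quarterly report')
```

Because the helper only runs when a value is read, the stored header text is never modified.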

Version 0.41.2

25 May 02:00
e8403b7

v0.41.2

  • Updated annotations on MessageBase.save.
  • Added new enum BodyTypes.
  • Added property MessageBase.detectedBodies for detecting what bodies have been stored (not generated by the module) in the .msg file.

Version 0.41.1

10 May 17:24
7e93f7b

v0.41.1

  • [TeamMsgExtractor #362] Fixed an issue with the removal of the --dev option missing one of the checks (I swear I actually tested it).

Version 0.41.0

09 May 19:53
1837ce2

v0.41.0

  • [TeamMsgExtractor #357] Fixed an issue where the properties stream being absent would raise an error message that was not clear.
  • [TeamMsgExtractor #357] Added a way to suppress StandardViolationError for poorly created files. This may cause issues you might not expect since this exception is meant to stop the processing for a reason.
  • [TeamMsgExtractor #223] Finally got around to dealing with signed attachments that are embedded MSG files. SignedAttachment.data now returns either bytes or MSGFile. SignedAttachment also now has an asBytes property that will return the bytes that created the signed attachment, regardless of whether it is an MSG file or not. This makes it unnecessary to call MSGFile.exportBytes to get the bytes of the embedded MSG file, a call that can add a small delay to your code. As Attachment is a much more complex class, it does not (at least yet) have this property. Also unlike Attachment, SignedAttachment will not throw an exception if the data is an MSG file but is not supported. Instead, it will simply be logged as an exception, but the code will continue. If the data is successfully read as an embedded MSG file, the AttachmentType will be AttachmentType.SIGNED_EMBEDDED.
  • Deprecated AttachErrorBehavior in favor of the new ErrorBehavior enum which controls the error behavior for the various parts of the MSG file. Uses of the former will work until the next major version.
  • Deprecated the attachmentErrorBehavior parameter of MSGFile in favor of errorBehavior. Uses of the previous will work until the next major version.
  • Added treePath property to SignedAttachment to bring it more in line with AttachmentBase.
  • Fixed bug in rare logging message caused by incorrect type name.
  • Removed the dev and dev_classes submodules, as most of their features are possible using the base code. Additionally, the classes involved became significantly outdated over time.
  • Removed the --dev argument from the command line.
  • Changed the message for the standards violation error when an attachment has no specified attachment type and it could not be determined automatically. The error now specifies the path of the attachment that had an issue for easier debugging. Additionally, the log message for it simply not being present has also been changed to show this information.
  • Added a dev level log that will output the entire properties mapping if the attachment type is not set. Dev level is 5.
  • Fixed a critical error in Properties that caused the __contains__ method to always be False. This occurred because it was missing a return statement. Fortunately, it looks like only one part of the module was affected due to other parts using a properly written function.
  • Removed some debug prints that slipped through.
  • Changed some parts to use the in operator for checking that a property exists, as opposed to has_key. The function was only there to act more like a Python 2 dictionary.
  • Deprecated Properties.has_key.
  • Removed the validation submodule and all related references. It was pretty outdated and has minimal usage at this point in time. It may come back at some later point.
  • Changed behavior of MessageBase.save so it doesn't save raw when an exception occurs. This behavior may have ended up creating unexpected output which is why it was removed. It was mainly there for debugging in the first place, but is no longer necessary.
  • Added __contains__ function to Named class.
  • Fixed missing import in message_base.py that would only cause problems if something was wrong with the HTML.
  • Fixed a bad error message appearing when trying to add or use an entry in OleWriter using an empty path.
  • Changed the InvalidFileFormatError for a missing property stream to a StandardViolationError.
  • Changed the exception messages for a few exceptions to fix typos and clarify.
  • Fixed issues in _rtf.tokenize_rtf which would cause an exception to be handled incorrectly and throw an unclear error.
  • Fixed various small bugs caused by typos.
  • Cleaned up unneeded imports.
  • Improved existing __all__ entries and added some where they should be.
  • Removed default export of the UnrecognizedMSGTypeError exception in favor of exporting the exceptions module.
  • Removed default export of properHex.
  • Improved type checking in many places.
  • Fixed issues in RecurrencePattern.
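
The __contains__ bug described above is easy to reproduce with a small stand-in class. The sketch below is not the module's Properties code: BrokenProps and FixedProps are hypothetical names, and the has_key shim only shows one plausible way a deprecated method could defer to the in operator.

```python
import warnings


class BrokenProps:
    """Stand-in with the bug: the membership result is never returned."""

    def __init__(self, props):
        self._props = dict(props)

    def __contains__(self, key):
        # The expression is evaluated and then discarded, so the method
        # returns None and `key in obj` always evaluates to False.
        key in self._props


class FixedProps(BrokenProps):
    """Stand-in with the missing return statement restored."""

    def __contains__(self, key):
        return key in self._props

    def has_key(self, key):
        # Hypothetical deprecation shim that defers to the `in` operator.
        warnings.warn('has_key is deprecated; use `in` instead.',
                      DeprecationWarning, stacklevel=2)
        return key in self


broken_result = 'example' in BrokenProps({'example': 1})
fixed_result = 'example' in FixedProps({'example': 1})
```

Python converts the None returned by the broken method to False, which is why the bug never raised an error and went unnoticed.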

Version 0.40.0: Better RTF Handling

18 Mar 23:28
931025c
Pre-release

v0.40.0

  • [TeamMsgExtractor #338] Added new code to handle injection of text into the RTF body. For many cases, this will be much more effective as it relies on ensuring that it is in the main group and past the header before injection. It is not currently the first choice as it doesn't have proper respect for encapsulated HTML, however it will replace some of the old methods entirely. Solving this issue was done through the use of a few functions and the internal _rtf module. This module in its entirety is considered to be implementation details, and I give no guarantee that it will remain in its current state even across patch versions. As such, it is not recommended to use it outside of the module.
  • Changed MessageBase.rtfEncapInjectableHeader and MessageBase.rtfPlainInjectableHeader from str to bytes. They always get encoded anyways, so I don't know why I had them returning as str.
  • Updated minimum Python version to 3.8 as 3.6 has reached end of support and 3.7 will reach end of support within the year.
  • Updated information in README.

Version 0.39.2

27 Feb 04:31
569176a
Pre-release

v0.39.2

  • Fixed issues with AttachmentBase.name that could cause it to be generated incorrectly.
  • Added convenience function MSGFile.exportBytes which returns the exported version from MSGFile.export as bytes instead of writing it to a file or file-like object.
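
The general pattern behind a convenience function like this can be sketched with an in-memory buffer. The Exporter class below is a hypothetical stand-in, not the module's MSGFile, and its method names and signatures are assumptions made for illustration only.

```python
import io


class Exporter:
    """Hypothetical stand-in for an object that can export itself."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def export(self, out) -> None:
        # Write the exported data to any binary file-like object.
        out.write(self._payload)

    def export_bytes(self) -> bytes:
        # Reuse export() with an in-memory buffer and return its contents,
        # so callers do not need a temporary file.
        with io.BytesIO() as buffer:
            self.export(buffer)
            return buffer.getvalue()


data = Exporter(b'example export').export_bytes()
```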

Version 0.39.1

12 Feb 23:15
b6460e0
Pre-release

v0.39.1

  • [TeamMsgExtractor #333] Fixed typo in a warning.
  • [TeamMsgExtractor #334] Removed __del__ method from MSGFile. It was there for cleanup, but wasn't planned well enough to stop it from causing issues. It may be reintroduced in the future if I can manage to remove the issues.
  • [TeamMsgExtractor #335] Fixed some parts of extract_msg.utils.getCommandArgs having invalid logic after a previous (rather old) update that caused exceptions when using certain options.
  • Added new property treePath to AttachmentBase and MSGFile (which adds it to nearly every class). This property is the path to the current instance, represented as a tuple of instances that would be used to get to the current instance.
  • Added sphinx documentation.
  • Fixed an issue in OleWriter that would produce corrupted OLE files if the DIFAT needed more than the header.
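
The idea behind treePath can be pictured with a small parent-linked structure. The Node class below is hypothetical, and the sketch assumes the tuple ends with the instance itself, which the note above does not spell out.

```python
from typing import Optional, Tuple


class Node:
    """Hypothetical stand-in for a message or attachment instance."""

    def __init__(self, name: str, parent: Optional['Node'] = None):
        self.name = name
        self.parent = parent

    @property
    def treePath(self) -> Tuple['Node', ...]:
        # Build the path from the top-level instance down to this one.
        if self.parent is None:
            return (self,)
        return self.parent.treePath + (self,)


root = Node('top-level message')
attachment = Node('attachment', parent=root)
embedded = Node('embedded message', parent=attachment)
path_names = [node.name for node in embedded.treePath]
```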

Version 0.39.0: Advanced OLE writing

18 Jan 03:49
ab7618c
Pre-release

This release fixed several issues while also significantly increasing the functionality of the OleWriter class.

v0.39.0

  • [TeamMsgExtractor #318] Added code to handle a standards violation (from what I can tell, anyways) caused by the attachment not having an AttachMethod property. The code will log a warning, attempt to detect the method, and throw a StandardViolationError if it fails.
  • [TeamMsgExtractor #320] Changed the way string named properties are handled to allow for the string stream to have some errors and still be parsed. Warnings about these errors will be logged.
  • [TeamMsgExtractor #324] Fixed an issue with contact saving when a list property returns None.
  • [TeamMsgExtractor #326] Fixed a bug that could cause some files to error when exporting.
  • Fixed an issue where creation and modification times were not being copied to the new OLE file created by OleWriter.
  • Fixed up a few docstrings.
  • Fixed a few issues in MSGFile regarding the filename keyword argument.
  • Added new argument rootPath to OleWriter.fromOleFile for saving a specific directory from an OLE file instead of just copying the entire file. That directory will become the root of the new one.
  • Adjusted code for OleWriter to generate certain values only at save time to make them more dynamic. This allows for existing streams to be properly edited (although has issues with allowing storages to be edited).
  • Added new function OleWriter.deleteEntry to remove an entry that was already added. If the entry is a storage, all children will be removed too.
  • Added new function OleWriter.editEntry to edit an entry that was already added.
  • Added new function OleWriter.addEntry to add a new entry to the writer without an OleFileIO instance. Properties of the entry are instead set using the same keyword arguments as described in OleWriter.editEntry.
  • Changed _DirectoryEntry to DirectoryEntry to make the more finalized version public. Access to the originals that the OleWriter class creates should never happen; instead, copies should be returned to ensure the behavior is as expected.
  • Added new function OleWriter.getEntry which returns a copy of the DirectoryEntry instance for that stream or storage in the writer. Use this function to see the current internal state of an entry.
  • Added new function OleWriter.renameEntry which allows the user to rename a stream or storage (in place). This only changes its direct name and not its location in the new OLE file.
  • Added new function OleWriter.walk which is similar to os.walk but for walking the structure of the new OLE file.
  • Added new function OleWriter.listItems which is functionally equivalent to olefile.OleFileIO.listdir which returns a list of paths to every item. Optionally a user can get the paths just for streams, just for storages, or both. Requesting neither will simply return an empty list. Default is to just return streams.
  • Added a small amount of path validation to inputToMsgPath which is used in a lot of places where user input for a path is accepted. It ensures illegal characters don't exist and that the path segments (each name for a storage or stream) are less than 32 characters. This will be most helpful for OleWriter.
  • Added many internal helper functions to OleWriter to make extensions easier and consolidate common code. Many of these involve direct access to internal data which is why they are private.
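
The path validation described above can be sketched roughly as follows. This is not the module's inputToMsgPath: split_and_validate is a hypothetical helper, and the set of illegal characters is an assumption based on the naming rules of the compound file format rather than on the module's source.

```python
# Assumed illegal characters for a stream or storage name.
ILLEGAL_CHARS = frozenset('/\\:!')
# "Less than 32 characters" means at most 31.
MAX_SEGMENT_LENGTH = 31


def split_and_validate(path):
    """Split a path into segments and check each one."""
    if isinstance(path, str):
        # Accept either slash style as a separator and drop empty parts.
        segments = [part for part in path.replace('\\', '/').split('/') if part]
    else:
        segments = list(path)

    for segment in segments:
        if len(segment) > MAX_SEGMENT_LENGTH:
            raise ValueError(
                f'Path segment {segment!r} is longer than '
                f'{MAX_SEGMENT_LENGTH} characters.')
        bad = ILLEGAL_CHARS.intersection(segment)
        if bad:
            raise ValueError(
                f'Path segment {segment!r} contains illegal '
                f'characters: {"".join(sorted(bad))}')
    return segments


segments = split_and_validate('storage/stream')
```

Validating once at the point where user input is accepted means every function that takes a path gets the same checks.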

Version 0.38.4

03 Dec 09:08
b43dcf9
Pre-release

v0.38.4

  • Fixed a line in OleWriter that was causing exporting to fail.
  • Fixed some issues with the README.