Releases: TeamMsgExtractor/msg-extractor
Version 0.41.4
v0.41.4
- Fixed an issue in the last version that would break the decoding function if the contents were not encoded.
- Updated `tzlocal` and allowed future updates for `compressed_rtf` and `ebcdic`.
Version 0.41.3
v0.41.3
- [TeamMsgExtractor #365] Fixed an issue that would cause certain values retrieved from the header to not be decoded properly. It does this when retrieving the values, so nothing about the header has been changed.
- Added new property `MessageBase.headerText` which is the text content of the header stream. Adjusted other things to use this instead of trying to retrieve the stream directly in multiple places.
- Added typing to `MessageBase.header`.
Version 0.41.2
v0.41.2
- Updated annotations on `MessageBase.save`.
- Added new enum `BodyTypes`.
- Added property `MessageBase.detectedBodies` for detecting what bodies have been stored (not generated by the module) in the .msg file.
Version 0.41.1
v0.41.1
- [TeamMsgExtractor #362] Fixed an issue with the removal of the `--dev` option missing one of the checks (I swear I actually tested it).
Version 0.41.0
v0.41.0
- [TeamMsgExtractor #357] Fixed an issue where the properties stream being absent would raise an error message that was not clear.
- [TeamMsgExtractor #357] Added a way to suppress `StandardViolationError` for poorly created files. This may cause issues you might not expect since this exception is meant to stop the processing for a reason.
- [TeamMsgExtractor #223] Finally got around to dealing with signed attachments that are embedded MSG files. `SignedAttachment.data` now returns either `bytes` or `MSGFile`. `SignedAttachment` also now has an `asBytes` property that will return the bytes that created the signed attachment, regardless of whether it is an MSG file or not, making it unnecessary to call `MSGFile.exportBytes` to get the bytes of the embedded MSG file, which can add a small delay to your code. As `Attachment` is a much more complex class, it does not (at least yet) have this property. Also unlike `Attachment`, `SignedAttachment` will not throw an exception if the data is an MSG file but is not supported. Instead, it will simply be logged as an exception, but the code will continue. If the data is successfully read as an embedded MSG file, the `AttachmentType` will be `AttachmentType.SIGNED_EMBEDDED`.
- Deprecated `AttachErrorBehavior` in favor of the new `ErrorBehavior` enum which controls the error behavior for the various parts of the MSG file. Uses of the former will work until the next major version.
- Deprecated the `attachmentErrorBehavior` parameter of `MSGFile` in favor of `errorBehavior`. Uses of the previous will work until the next major version.
- Added `treePath` property to `SignedAttachment` to bring it more in line with `AttachmentBase`.
. - Fixed bug in rare logging message caused by incorrect type name.
- Removed the `dev` and `dev_classes` submodules, as most of their features are possible using the base code. Additionally, the classes involved became significantly outdated over time.
- Removed the `--dev` argument from the command line.
- Changed the message for the standards violation error when an attachment has no specified attachment type and it could not be determined automatically. The error now specifies the path of the attachment that had an issue for easier debugging. Additionally, the log message for it simply not being present has also been changed to show this information.
- Added a dev level log that will output the entire properties mapping if the attachment type is not set. Dev level is 5.
- Fixed a critical error in `Properties` that caused the `__contains__` method to always be `False`. This occurred because it was missing a `return` statement. Fortunately, it looks like only one part of the module was affected due to other parts using a properly written function.
- Removed some debug prints that slipped through.
- Changed some parts to use `in` for checking that a property exists as opposed to `has_key`. The function was there to act more like Python 2.
- Deprecated `Properties.has_key`.
- Removed the `validation` submodule and all related references. It was pretty outdated and has minimal usage at this point in time. It may come back at some later point.
- Changed behavior of `MessageBase.save` so it doesn't save raw when an exception occurs. This behavior may have ended up creating unexpected output which is why it was removed. It was mainly there for debugging in the first place, but is no longer necessary.
- Added `__contains__` function to `Named` class.
- Fixed missing import in `message_base.py` that would only cause problems if something was wrong with the HTML.
- Fixed a bad error message appearing when trying to add or use an entry in `OleWriter` using an empty path.
- Changed the `InvalidFileFormatError` for a missing property stream to a `StandardViolationError`.
- Changed the exception messages for a few exceptions to fix typos and clarify.
- Fixed issues in `_rtf.tokenize_rtf` which would cause an exception to be handled incorrectly and throw an unclear error.
- Fixed various small bugs caused by typos.
- Cleaned up unneeded imports.
- Improved existing `__all__` entries and added some where they should be.
- Removed default export of the `UnrecognizedMSGTypeError` exception in favor of exporting the exceptions module.
- Removed default export of `properHex`.
- Improved type checking in many places.
- Fixed issues in `RecurrencePattern`.
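The `__contains__` fix in this release is easy to reproduce in isolation. The following is a minimal sketch using hypothetical stand-in classes (`BrokenProps` and `FixedProps` are not part of the module, and the key is just a sample string) to show why a missing `return` makes every `in` check come out `False`:

```python
class BrokenProps:
    """Hypothetical stand-in for a mapping-like class; not the module's code."""

    def __init__(self, data):
        self._data = dict(data)

    def __contains__(self, key):
        # Bug: the membership test is evaluated but never returned, so the
        # method returns None, which the `in` operator converts to False.
        key in self._data


class FixedProps(BrokenProps):
    def __contains__(self, key):
        # Fix: actually return the result of the membership test.
        return key in self._data


broken = BrokenProps({"0E1D001F": "subject"})
fixed = FixedProps({"0E1D001F": "subject"})

broken_result = "0E1D001F" in broken  # False, even though the key exists
fixed_result = "0E1D001F" in fixed    # True
```

This is also why preferring `in` over a separate `has_key`-style helper only works once `__contains__` itself is correct.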
Version 0.40.0: Better RTF Handling
v0.40.0
- [TeamMsgExtractor #338] Added new code to handle injection of text into the RTF body. For many cases, this will be much more effective as it relies on ensuring that it is in the main group and past the header before injection. It is not currently the first choice as it doesn't have proper respect for encapsulated HTML, however it will replace some of the old methods entirely. Solving this issue was done through the use of a few functions and the internal `_rtf` module. This module in its entirety is considered to be implementation details, and I give no guarantee that it will remain in its current state even across patch versions. As such, it is not recommended to use it outside of the module.
- Changed `MessageBase.rtfEncapInjectableHeader` and `MessageBase.rtfPlainInjectableHeader` from `str` to `bytes`. They always get encoded anyways, so I don't know why I had them returning as `str`.
- Updated minimum Python version to 3.8 as 3.6 has reached end of support and 3.7 will reach end of support within the year.
- Updated information in `README`.
Version 0.39.2
v0.39.2
- Fixed issues with `AttachmentBase.name` that could cause it to be generated incorrectly.
- Added convenience function `MSGFile.exportBytes` which returns the exported version from `MSGFile.export` as bytes instead of writing it to a file or file-like object.
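The general pattern behind a bytes-returning convenience wrapper like this can be sketched as follows. This is an illustration only: `export` and `export_bytes` here are hypothetical free functions with a placeholder payload, not the module's actual implementation.

```python
import io


def export(target):
    """Hypothetical exporter that writes its output to a file-like object."""
    target.write(b"placeholder payload")  # stands in for the real file data


def export_bytes():
    """Run the exporter against an in-memory buffer and return the bytes."""
    buffer = io.BytesIO()
    export(buffer)
    return buffer.getvalue()


data = export_bytes()
```

The wrapper saves callers from creating and reading back their own in-memory buffer.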
Version 0.39.1
v0.39.1
- [TeamMsgExtractor #333] Fixed typo in a warning.
- [TeamMsgExtractor #334] Removed `__del__` method from `MSGFile`. It was there for cleanup, but wasn't planned well enough to stop it from causing issues. It may be reintroduced in the future if I can manage to remove the issues.
- [TeamMsgExtractor #335] Fixed some parts of `extract_msg.utils.getCommandArgs` having invalid logic after a previous (rather old) update that caused exceptions when using certain options.
- Added new property `treePath` to `AttachmentBase` and `MSGFile` (which adds it to nearly every class). This property is the path to the current instance, represented as a tuple of instances that would be used to get to the current instance.
- Added Sphinx documentation.
- Fixed an issue in `OleWriter` that would produce corrupted OLE files if the DIFAT needed more than the header.
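To make the idea of a path expressed as a tuple of instances concrete, here is a small sketch. The `Node` class and its `tree_path` property are hypothetical, and whether the real `treePath` includes the current instance itself or holds the instances directly is an assumption made for illustration.

```python
class Node:
    """Hypothetical object in a parent/child hierarchy (not the module's classes)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def tree_path(self):
        # Walk up through the parents and build the path from the root down.
        # Assumption: the path ends with the current instance.
        if self.parent is None:
            return (self,)
        return self.parent.tree_path + (self,)


root = Node("message")
attachment = Node("attachment", parent=root)
embedded = Node("embedded message", parent=attachment)

names = [node.name for node in embedded.tree_path]
```

Here `names` lists each step from the top-level object down to the embedded one, which is the kind of information such a property makes available for debugging.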
Version 0.39.0: Advanced OLE writing
This release fixed several issues while also significantly increasing the functionality of the `OleWriter` class.
v0.39.0
- [TeamMsgExtractor #318] Added code to handle a standards violation (from what I can tell, anyways) caused by the attachment not having an `AttachMethod` property. The code will log a warning, attempt to detect the method, and throw a `StandardViolationError` if it fails.
- [TeamMsgExtractor #320] Changed the way string named properties are handled to allow for the string stream to have some errors and still be parsed. Warnings about these errors will be logged.
- [TeamMsgExtractor #324] Fixed an issue with contact saving when a list property returns `None`.
- [TeamMsgExtractor #326] Fixed a bug that could cause some files to error when exporting.
- Fixed an issue where creation and modification times were not being copied to the new OLE file created by `OleWriter`.
- Fixed up a few docstrings.
- Fixed a few issues in `MSGFile` regarding the `filename` keyword argument.
- Added new argument `rootPath` to `OleWriter.fromOleFile` for saving a specific directory from an OLE file instead of just copying the entire file. That directory will become the root of the new one.
- Adjusted code for `OleWriter` to generate certain values only at save time to make them more dynamic. This allows for existing streams to be properly edited (although has issues with allowing storages to be edited).
- Added new function `OleWriter.deleteEntry` to remove an entry that was already added. If the entry is a storage, all children will be removed too.
- Added new function `OleWriter.editEntry` to edit an entry that was already added.
- Added new function `OleWriter.addEntry` to add a new entry to the writer without an `OleFileIO` instance. Properties of the entry are instead set using the same keyword arguments as described in `OleWriter.editEntry`.
- Changed `_DirectoryEntry` to `DirectoryEntry` to make the more finalized version public. Access to the originals that the `OleWriter` class creates should never happen, instead copies should be returned to ensure the behavior is as expected.
- Added new function `OleWriter.getEntry` which returns a copy of the `DirectoryEntry` instance for that stream or storage in the writer. Use this function to see the current internal state of an entry.
- Added new function `OleWriter.renameEntry` which allows the user to rename a stream or storage (in place). This only changes its direct name and not its location in the new OLE file.
- Added new function `OleWriter.walk` which is similar to `os.walk` but for walking the structure of the new OLE file.
- Added new function `OleWriter.listItems` which is functionally equivalent to `olefile.OleFileIO.listdir` which returns a list of paths to every item. Optionally a user can get the paths just for streams, just for storages, or both. Requesting neither will simply return an empty list. Default is to just return streams.
- Added a small amount of path validation to `inputToMsgPath` which is used in a lot of places where user input for a path is accepted. It ensures illegal characters don't exist and that the path segments (each name for a storage or stream) are less than 32 characters. This will be most helpful for `OleWriter`.
- Added many internal helper functions to `OleWriter` to make extensions easier and consolidate common code. Many of these involve direct access to internal data which is why they are private.
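The kind of path validation described for `inputToMsgPath` can be sketched roughly as below. This is not the module's code: `validate_msg_path` is a hypothetical helper, the `/` separator is an assumption, and the illegal-character set is an assumption based on the characters the compound file format disallows in entry names.

```python
# Assumption: characters not permitted inside a storage or stream name.
# The forward slash is treated as the path separator in this sketch.
ILLEGAL_CHARS = frozenset("\\:!")


def validate_msg_path(path):
    """Split a '/'-separated path and check each segment (hypothetical helper)."""
    segments = path.split("/")
    for segment in segments:
        if not segment:
            raise ValueError("Path contains an empty segment.")
        if len(segment) >= 32:
            raise ValueError(f"Segment {segment!r} must be less than 32 characters.")
        bad = ILLEGAL_CHARS.intersection(segment)
        if bad:
            raise ValueError(f"Segment {segment!r} contains illegal characters: {sorted(bad)}")
    return segments


parts = validate_msg_path("__attach_version1.0_#00000000/__substg1.0_37010102")
```

Rejecting bad input at this point means a writer fails early with a clear message instead of producing a file that other readers cannot open.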
Version 0.38.4
v0.38.4
- Fixed line in `OleWriter` that was causing exporting to fail.
- Fixed some issues with the `README`.