Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to use hash in changelog filenames #996

Merged
merged 4 commits into from
Nov 6, 2024

Conversation

ecraig12345
Copy link
Member

@ecraig12345 ecraig12345 commented Oct 4, 2024

Add an option changelog.uniqueFilenames which adds an 8-character random suffix to the end of the changelog filename (before the extension): e.g. CHANGELOG-abcd1234.md.

EDIT: After discussion, the suffix is now the first 8 characters of the MD5 hash digest of the package name.

This will stay stable between commits, and any existing changelog files will be renamed when modified. Existing suffixed files should also be renamed if the package is renamed. (I'm guessing 8 characters is good enough, but we could also use 12 or more.)

This is one option for working around an issue with Git: its default hash algorithm only considers the last 16 characters of filenames, which can lead to collisions and inefficient packing when many files have similar names, which then leads to extreme growth in size for larger repos with many colliding changelog files.

Bonus: while adding the hash, I learned we can use crypto.randomUUID() for change file IDs and remove the uuid dependency.

(I also updated a couple things with test configuration: disable type checking in ts-jest since it's redundant and slow, and set launch.json to use Node 14 like the rest of the repo.)

Related to #978

Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. I'm not an expert in this codebase or Javascript, so take all of my comments with a grain of salt. I tried to focus on the highest level of functionality, which I believe your implementation will suffice. I prod in a couple of directions for your consideration, but do not consider them blockers.

src/changelog/prepareChangelogPaths.ts Outdated Show resolved Hide resolved
src/changelog/prepareChangelogPaths.ts Outdated Show resolved Hide resolved
@ecraig12345 ecraig12345 changed the title Add option to use unique suffix in changelog filenames Add option to use hash in changelog filenames Oct 8, 2024
if (existingSuffixedPaths[ext]) {
moveIfNeeded({ oldPath: existingSuffixedPaths[ext]!, newPath: defaultPath, hadSuffix: true });
// Generate a unique filename based on the package name hash.
const hash = crypto.createHash('md5').update(packageName).digest('hex').slice(0, 8);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this change. This will avoid the possibility of some corner cases with the random method. There may even be some benefits if a package is moved within the repo (without renaming the package).

let newestDate = 0;
let newestFile: string | undefined;

for (const file of fs.readdirSync(cwd)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best practice would be to pass the { withFileTypes: true } option and test that the entry is a file.

Comment on lines +100 to +101
if (fs.existsSync(oldPath)) {
fs.renameSync(oldPath, newPath);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existence checking files before operating on them is generally an anti-pattern that is advised against by the authors of the file system APIs, since in theory the existence can change between the calls, so you have to handle the non-existence during the real call anyway. Simpler to just handle the ENOENT or ENOTDIR during the renameSync call.

@ecraig12345 ecraig12345 merged commit e349314 into master Nov 6, 2024
4 checks passed
@ecraig12345 ecraig12345 deleted the ecraig/changelog-hash branch November 6, 2024 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants