Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Filter To Persist For Warm Boot #28

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,48 @@ var node = new Awk('docs', 'ES6', 'ECMAScript 2015');
module.exports = node;
```

## Persistent Cache

__Note: This feature is experimental and is only available on Unix based systems.__

Adding persist flag allows a subclass to persist state across restarts. This exists to mitigate the upfront cost of some more expensive transforms on warm boot. __It does not aim to improve incremental build performance, if it does, it should indicate something is wrong with the filter or input filter in question.__

### How does it work?

It does so but establishing a 2 layer file cache. The first layer, is the entire bucket.
The second, `cacheKeyProcessString` is a per file cache key.

Together, these two layers should provide the right balance of speed and sensibility.

The bucket level cacheKey must be stable but also never become stale. If the key is not
stable, state between restarts will be lost and performance will suffer. On the flip-side,
if the cacheKey becomes stale changes may not be correctly reflected.

It is configured by subclassing and refining `cacheKey` method. A good key here, is
likely the name of the plugin, its version and the actual versions of its dependencies.

```js
Subclass.prototype.cacheKey = function() {
return md5(Filter.prototype.call(this) + inputOptionsChecksum + dependencyVersionChecksum);
}
```

The second key, represents the contents of the file. Typically the base-class's functionality
is sufficient, as it merely generates a checksum of the file contents. If for some reason this
is not sufficient, it can be re-configured via subclassing.

```js
Subclass.prototype.cacheKeyProcessString = function(string, relativePath) {
return superAwesomeDigest(string);
}
```

It is recommended that persistent re-builds is opt-in by the consumer as it does not currently work on all systems.

```js
var myTree = new SomePlugin('lib', { persist: true });
```

## FAQ

### Upgrading from 0.1.x to 1.x
Expand Down
80 changes: 68 additions & 12 deletions index.js
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,9 @@ var copyDereferenceSync = require('copy-dereference').sync;
var Cache = require('./lib/cache');
var debugGenerator = require('debug');
var keyForFile = require('./lib/key-for-file');
var PersistentCache = require('async-disk-cache');
var hashForDep = require('hash-for-dep');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • should confirm this works as expected given a npm v3 dedupe.
  • should add timing logging to hash-for-dep – in some scenarios it could be a boot time perf issue.

var md5Hex = require('md5-hex');

module.exports = Filter;

Expand Down Expand Up @@ -43,6 +46,16 @@ function Filter(inputTree, options) {
this.inputEncoding = options.inputEncoding;
if (options.outputEncoding != null)
this.outputEncoding = options.outputEncoding;
if (options.persist) {
if (/^win/.test(process.platform)) {
console.log('Unfortunately persistent cache is currently not available on windows based systems.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should link to a tracking issue from this warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

} else {
this.persistent = options.persist;
this._peristentCache = new PersistentCache(this.cacheKey(), {
compression: 'deflate'
});
}
}
}

this._cache = new Cache();
Expand Down Expand Up @@ -75,6 +88,36 @@ Filter.prototype.build = function build() {
});
};

/*
* @private
*
*
* @method cachKey
* @return {String} this filters top-level cache key
*/
Filter.prototype.cacheKey = function() {
return hashForDep(this.baseDir());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is fairly costly, and we easily have N independent broccoli transforms. I believe this only needs to run once per subclass, not once per instance. As such I believe caching the result on the constructor should be sufficient.

};

/* @public
*
* @method baseDir
* @returns {String} absolute path to the root of the filter...
*/
Filter.prototype.baseDir = function() {
throw Error('Filter must implement prototype.baseDir');
};

/*
* @public
*
* @method cacheKeyProcessString
* @return {String} this filters top-level cache key
*/
Filter.prototype.cacheKeyProcessString = function(string) {
return md5Hex(string);
};

Filter.prototype.canProcessFile =
function canProcessFile(relativePath) {
return !!this.getDestFilePath(relativePath);
Expand Down Expand Up @@ -160,19 +203,32 @@ Filter.prototype.processFile =
if (outputEncoding === void 0) outputEncoding = 'utf8';
var contents = fs.readFileSync(
srcDir + '/' + relativePath, { encoding: inputEncoding });
var promise;

return Promise.resolve(this.processString(contents, relativePath)).
then(function asyncOutputFilteredFile(outputString) {
var outputPath = self.getDestFilePath(relativePath);
if (outputPath == null) {
throw new Error('canProcessFile("' + relativePath + '") is true, but getDestFilePath("' + relativePath + '") is null');
}
outputPath = destDir + '/' + outputPath;
mkdirp.sync(path.dirname(outputPath));
fs.writeFileSync(outputPath, outputString, {
encoding: outputEncoding
});
});
if (this.persistent) {
var key = this.cacheKeyProcessString(contents, relativePath);
promise = this._peristentCache.get(key).then(function(entry) {
return entry.isCached ? entry.value : self.processString(contents, relativePath);
});
} else {
promise = Promise.resolve(this.processString(contents, relativePath));
}

return promise.then(function asyncOutputFilteredFile(outputString) {
var outputPath = self.getDestFilePath(relativePath);
if (outputPath == null) {
throw new Error('canProcessFile("' + relativePath + '") is true, but getDestFilePath("' + relativePath + '") is null');
}
outputPath = destDir + '/' + outputPath;
mkdirp.sync(path.dirname(outputPath));
fs.writeFileSync(outputPath, outputString, {
encoding: outputEncoding
});

if (self.persistent) {
return self._peristentCache.set(key, outputString);
}
});
};

Filter.prototype.processString =
Expand Down
3 changes: 3 additions & 0 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -25,10 +25,13 @@
"cache"
],
"dependencies": {
"async-disk-cache": "^1.0.0",
"broccoli-kitchen-sink-helpers": "^0.2.7",
"broccoli-plugin": "^1.0.0",
"copy-dereference": "^1.0.0",
"debug": "^2.2.0",
"hash-for-dep": "0.0.3",
"md5-hex": "^1.0.2",
"mkdirp": "^0.5.1",
"promise-map-series": "^0.2.1",
"rsvp": "^3.0.18",
Expand Down
Empty file removed test/fixtures/directory/.gitkeep
Empty file.
2 changes: 1 addition & 1 deletion test/key-for-file-tests.js
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ describe('keyForFile', function () {
describe('when given a path to a directory', function () {
it('throws an error', function () {
expect(function () {
keyForFile('./test/fixtures/directory');
keyForFile('./test/fixtures/dir');
}).to.throw(/cannot diff directory/i);
});
});
Expand Down
56 changes: 56 additions & 0 deletions test/test.js
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,10 @@ ReplaceFilter.prototype.processString = function(contents, relativePath) {
return result;
};

ReplaceFilter.prototype.baseDir = function() {
return '../';
};

function IncompleteFilter(inputTree, options) {
if (!this) return new IncompleteFilter(inputTree, options);
Filter.call(this, inputTree, options);
Expand Down Expand Up @@ -290,6 +294,58 @@ describe('Filter', function() {
to.equal('utf8');
});

if (!/^win/.test(process.platform)) {
describe('persistent cache', function() {
var f;
function F(inputTree, options) { Filter.call(this, inputTree, options); }
inherits(F, Filter);
F.prototype.baseDir = function() {
return '../';
};

beforeEach(function() {
f = new F(fixturePath, { persist: true });
});

it('cache is initialized', function() {
expect(f._peristentCache).to.be.ok;
});

it('default `baseDir` implementation throws an Unimplemented Exception', function() {
function F(inputTree, options) { Filter.call(this, inputTree, options); }
inherits(F, Filter);
expect(function() {
new F(fixturePath, { persist: true });
}).to.throw(/Filter must implement prototype.baseDir/);
});

it('`cacheKeyProcessString` return correct first level file cache', function() {
expect(f.cacheKeyProcessString('foo-bar-baz', 'relative-path')).to.eql('4c43793687f9a7170a9149ad391cbf70');
});

it('filter properly reads file tree', function() {
var builder = makeBuilder(ReplaceFilter, fixturePath, function(awk) {
return awk;
});

return builder('dir', {
persist: true,
glob: '**/*.md',
search: 'dogs',
replace: 'cats'
}).then(function(results) {
expect(results.files).to.deep.eql([
'a/',
'a/README.md',
'a/bar/',
'a/bar/bar.js',
'a/foo.js'
]);
});
});
});
}

describe('processFile', function() {
beforeEach(function() {
sinon.spy(fs, 'mkdirSync');
Expand Down