Batch importing data results in IllegalArgumentException when data is unordered in time #168

Open
mbranden opened this issue Feb 19, 2013 · 2 comments

Comments

@mbranden

My workflow for feeding OpenTSDB involves offline processing of numerous log files to generate a single stream of importable data for 'tsdb import'. This data may not be ordered in time, which results in an exception being thrown from addPointInternal():

2013-02-19 15:37:16,218 INFO  [New I/O  worker #1] HBaseClient: Added client for region RegionInfo(table="tsdb-uid", region_name="tsdb-uid,,1361288131750.802dddc3726d461678a822323c4e28f6.", stop_key=""), which was added to the regions cache.  Now we know that RegionClient@1738275632(chan=[id: 0x7be11b78, /127.0.0.1:44445 => /127.0.0.1:50564], #pending_rpcs=0, #batched=0, #rpcs_inflight=0) is hosting 2 regions.
2013-02-19 15:37:16,223 INFO  [New I/O  worker #1] HBaseClient: Added client for region RegionInfo(table="tsdb", region_name="tsdb,,1361288133012.99acbf016aabf8f48d33904aa6be4052.", stop_key=""), which was added to the regions cache.  Now we know that RegionClient@1738275632(chan=[id: 0x7be11b78, /127.0.0.1:44445 => /127.0.0.1:50564], #pending_rpcs=0, #batched=0, #rpcs_inflight=1) is hosting 3 regions.
2013-02-19 15:37:16,417 ERROR [main] TextImporter: Exception caught while processing file tsdb-feb-14-import.txt line=apache2.resp_time 1360736752 0.001000 host=sim7000.nandi.lindenlab.com scheme=http instance=top
2013-02-19 15:37:16,428 INFO  [New I/O  worker #1] HBaseClient: Lost connection with the -ROOT- region
Exception in thread "main" java.lang.IllegalArgumentException: New timestamp=1360736752 is less than previous=1360861392 when trying to add value=[58, -125, 18, 111] to IncomingDataPoints([0, 0, 1, 81, 29, 24, 16, 0, 0, 1, 0, 0, 1, 0, 0, 2, 0, 0, 2, 0, 0, 3, 0, 0, 3] (metric=apache2.resp_time), base_time=1360861200 (Thu Feb 14 17:00:00 UTC 2013), [+12:float(0.0010000000474974513), +72:float(0.0010000000474974513), +132:float(0.0010000000474974513), +192:float(0.0010000000474974513)])
        at net.opentsdb.core.IncomingDataPoints.addPointInternal(IncomingDataPoints.java:201)
        at net.opentsdb.core.IncomingDataPoints.addPoint(IncomingDataPoints.java:283)
        at net.opentsdb.tools.TextImporter.importFile(TextImporter.java:153)
        at net.opentsdb.tools.TextImporter.main(TextImporter.java:72) 

I haven't looked at this closely enough to see whether this test is restricted to a single time series or has wider scope.

I'm dealing with this by simply sorting the unified stream before importing. So this may be taken as a bug, a feature request or just a documentation request for a works-as-intended test.
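
For illustration, a minimal sketch of that pre-sort step. It assumes each import line has the shape seen in the error log above (metric, timestamp, value, then tag=value pairs, separated by whitespace) and that the whole stream fits in memory. The function name `sort_import_lines` is hypothetical and is not part of OpenTSDB.

```python
def sort_import_lines(lines):
    """Return the non-empty import lines ordered by their timestamp field.

    Assumed line format (taken from the log above, not from OpenTSDB docs):
        metric timestamp value tag=value [tag=value ...]
    """
    # Drop blank lines and trailing newlines before sorting.
    records = [line.strip() for line in lines if line.strip()]
    # Field 1 is the timestamp; sorted() is stable, so lines that share a
    # timestamp keep their original relative order.
    return sorted(records, key=lambda line: int(line.split()[1]))
```

Sorting the whole stream by timestamp is stricter than necessary if the ordering requirement is per time series, but it is the simplest way to guarantee each series arrives in order.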

@tsuna
Member

tsuna commented Mar 8, 2013

The test applies only to a single time series (remember that a time series is uniquely defined by a metric name and a set of tags).
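
As a rough sketch of what "uniquely defined by a metric name and a set of tags" means for an import line, the hypothetical helper below derives a series key. It assumes the whitespace-separated line format from the log above; `series_key` is an illustrative name, not an OpenTSDB function.

```python
def series_key(line):
    """Return a hashable (metric, tags) key identifying the line's time series.

    Assumed line format: metric timestamp value tag=value [tag=value ...]
    """
    fields = line.split()
    metric = fields[0]
    # Tags form a set, so their order on the line must not affect the key.
    tags = frozenset(fields[3:])
    return (metric, tags)
```

Two lines with the same metric and the same tags (in any order) map to the same key; changing any tag value yields a different series.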

This check is intentional: importing data in order allows for various optimizations, which are especially important for batch imports. Is it OK for you to keep sorting the data before you batch-import it?
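
If it helps, here is a hedged sketch of a pre-flight check that scans an import file for lines that would go backwards in time within their own series. It only flags timestamps strictly lower than the previous one for the same series, matching the wording of the exception message; whether equal timestamps are also rejected is not stated in this thread. The name `find_out_of_order` and the line format are assumptions.

```python
def find_out_of_order(lines):
    """Return (line_number, timestamp, previous) for each out-of-order line.

    Assumed line format: metric timestamp value tag=value [tag=value ...]
    Ordering is tracked per series, keyed by metric name plus tag set.
    """
    last_seen = {}
    problems = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        key = (fields[0], frozenset(fields[3:]))
        timestamp = int(fields[1])
        previous = last_seen.get(key)
        if previous is not None and timestamp < previous:
            problems.append((lineno, timestamp, previous))
        else:
            # Only advance the high-water mark on in-order points.
            last_seen[key] = timestamp
    return problems
```

An empty result suggests the file is already ordered per series under these assumptions; otherwise the reported line numbers show where sorting is needed.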

@IDerr

IDerr commented Sep 19, 2017

Hi @mbranden
If your question has been answered, could you please close this issue?

Thanks :)
