Doesn't work for large files #1

Open

saschalalala opened this issue Nov 1, 2015 · 4 comments

saschalalala commented Nov 1, 2015

Hey, I tried processing a fairly large file (476 MB) and got an allocation size overflow.
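
For context: "allocation size overflow" is the error Firefox raises when a single allocation is too large. A minimal sketch of the pattern that can trigger it, assuming the reader loads the whole export into one string before parsing (the fileInput element is hypothetical, not the actual reader code):

const fileInput = document.querySelector('input[type="file"]'); // hypothetical
const reader = new FileReader();
reader.onload = () => {
  // Reading the whole file into one string and parsing it in a single
  // call means the peak allocation scales with the file size; for a
  // ~476 MB export this can exceed Firefox's limit and throw
  // "allocation size overflow".
  const conversations = JSON.parse(reader.result).conversation_state;
  console.log(conversations.length, "conversations");
};
reader.readAsText(fileInput.files[0]);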

@evitolins

Having the same issue in FF 52.0.1 with a 314.8 MB .json file

faddat commented Apr 29, 2017

500MB file dying here as well.

@danielhickman

Maybe someone could submit a PR with an implementation of something like this? I'm not sure if it's the same problem, but I really only wanted a single conversation, so I wrote a Ruby script to split my JSON file into many smaller ones by conversation ID. It messily produced ~50 files, and you'll need to search the original file to figure out which ID you need. You can also add them to Hangouts Reader one by one, since it doesn't discard previous parses.

require 'json'

# Read the Hangouts Takeout export and return the conversation list.
def get_json(file)
  abort "Cannot read #{file}" unless File.readable?(file)
  JSON.parse(File.read(file))["conversation_state"]
end

data = get_json("Hangouts.json")
puts "Parsed file"

# Group the entries by their conversation ID.
restructured = {}
data.each do |entry|
  id = entry["conversation_id"]["id"]
  puts "Found conversation in #{id}"
  restructured[id] ||= { "conversation_state" => [] }
  restructured[id]["conversation_state"] << entry
end
puts "Finished sorting"

# Write each conversation to its own <id>.json file.
restructured.each do |key, value|
  puts "Generating #{key}"
  File.write("#{key}.json", JSON.generate(value))
end
puts "Done"

I imported the 10 MB file just fine, but results may vary if you have a really long chat you'd like to import.

crutchcorn commented Nov 26, 2018

I couldn't quite figure out the Ruby program, so I wrote my own with Node and a ton of unneeded dependencies that I used out of laziness:

const jsonn = require("jsonstream"), // streaming JSON parser
  fs = require("fs"),
  util = require("util"),
  fs_writeFile = util.promisify(fs.writeFile),
  rxjs = require("rxjs"),
  { debounceTime } = require("rxjs/operators");

// Conversations grouped by ID, accumulated as the stream is read.
const dataa = {};

// Debounce the progress logging so the console isn't flooded per record.
const sub = new rxjs.Subject();
sub.pipe(debounceTime(1000)).subscribe(data => {
  console.log(`${data.new ? "new" : "old"}Id, ${data.id}`);
});

// Stream the export instead of JSON.parse-ing one giant string, so the
// parser only ever holds one conversation record at a time.
fs.createReadStream("./Hangouts.json")
  .pipe(jsonn.parse("conversations.*"))
  .on("data", data => {
    const id =
      data["conversation"] &&
      data["conversation"]["conversation_id"] &&
      data["conversation"]["conversation_id"]["id"];

    if (id) {
      if (dataa[id]) {
        sub.next({ new: false, id });
        dataa[id].conversations.push(data);
      } else {
        sub.next({ new: true, id });
        dataa[id] = { conversations: [data] };
      }
    }
  })
  .on("end", () => {
    // Chain the writes through the reduce accumulator so the <id>.json
    // files are written one at a time rather than all at once.
    Object.keys(dataa).reduce(async (prev, key) => {
      await prev;
      console.log("writing", key);
      await fs_writeFile(`${key}.json`, JSON.stringify(dataa[key]));
      delete dataa[key];
    }, Promise.resolve());
  });

Tested with 700 MB files.
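
One thing to keep in mind about the script above: the streaming parse avoids ever materializing the export as a single string, but dataa still accumulates every conversation in memory before the end handler writes them out, so peak memory grows with the size of the grouped data rather than staying constant.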
