Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for charts getting data from embedded excel sheets (rather than 'standard' charts) #102

Open
MP70 opened this issue Apr 20, 2024 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@MP70
Copy link
Collaborator

MP70 commented Apr 20, 2024

setChartData throws

Error: Could not find file ppt/charts/undefinedundefined.xml@Recreated_Presentation_OUTO1.pptx at ArchiveJszip.<anonymous> (/Users/matt/dev/pptxtopptxgenjs/node_modules/pptx-automizer/dist/helper/archive/archive-jszip.js:57:23)

on these charts.

@MP70 MP70 self-assigned this Apr 20, 2024
@MP70 MP70 added the bug Something isn't working label Apr 20, 2024
@singerla
Copy link
Owner

Hi @MP70! Can I help you on this? Please provide an example pptx file, I'm always interested in new varieties :)

@MP70
Copy link
Collaborator Author

MP70 commented Apr 20, 2024

Sorry! As attached. So I think I have a solution but it is super super involved and therefore I'm not sure if it's best inside the library or not.

We have two options;
1/ Go edit the linked xlsx file
2/ Delete the linked xlsx file, and its refs element, then edit the chart structure to allow for embedding the data directly as normal.

Either are quite involved and potentially fragile.

Recreated_Presentation_OUTO1.pptx

@MP70
Copy link
Collaborator Author

MP70 commented Apr 20, 2024

Example to get the data I am doing this ATM

  for (let element of elements) {
      let refElements = element.getElementsByTagName("c:numRef");
      if (refElements.length > 0) {
          let fElement = refElements[0].getElementsByTagName("c:f");
          if (fElement.length > 0) {
              let excelEmbedInfo = fElement[0].textContent;
              let extractedData = await this.extractExcelData(excelEmbedInfo, zip);
              data = data.concat(extractedData);
          } else {
              let valueElements = refElements[0].getElementsByTagName("c:v");
              let values = Array.from(valueElements, elem => parseFloat(elem.textContent));
              data = data.concat(values);
          }
      }
  }
  return data;
}

async extractExcelData(excelEmbedInfo, zip) {
  console.log(excelEmbedInfo)
  const excelFilePath = excelEmbedInfo.name

  if (excelFilePath) {
     const excelData = await zip.files[excelFilePath].async("nodebuffer");
     console.log(excelData);
      return this.parseExcelData(excelData);
  } else {
      console.error(`Excel file not found at path: ${excelFilePath}`);
      return [];
  }
}

parseExcelData(excelBuffer) {
  // currently using 'xlsx' library to parse Excel buffer, would prob need to add this dep..
  let workbook = XLSX.read(excelBuffer, {type: 'buffer'});
  let sheetName = workbook.SheetNames[0];
  let worksheet = workbook.Sheets[sheetName];
  let data = XLSX.utils.sheet_to_json(worksheet, {header:1});
  return data; // Flatten if the data structure is nested
}

that returns

 [ '', 'Series 1', 'Series 2', 'Series 3' ],
 [ 'Kategorie 1', 20, 20, 20 ],
 [ 'Kategorie 2', 10, 10, 10 ],
 [ 'Kategorie 3', 10, 10, 10 ],
 [ 'Kategorie 4', 10, 10, 10 ]
]```

@singerla
Copy link
Owner

The Error: Could not find file is basically caused by an absolute path in slide relations file:

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/chart" Target="/ppt/charts/chart1.xml"/>

Please see my latest commit.
Have a great week! :)

@singerla
Copy link
Owner

Hey @MP70, I've added an example to read data from a chart worksheets. I'm using a normal modifier to walk through rows and columns. I think it would be useful to have more "readers" in the future.

@llermaly
Copy link

Something to add to this discussion is the existing modifier will clean the underlying excel file but will not clean formulas, so if you modify a chart it will look fine, but then if you click "edit excel data" formulas will trigger, and if in the original file cells where references to elsewhere then the chart will break.

This approach gives room to control things like that?

@singerla
Copy link
Owner

singerla commented Aug 2, 2024

You could try to invoke a library like node-xlsx or the underlying sheetjs if you need more complex manipulation of xlsx worksheets. You just need to override the worksheet after automation. In my use cases, I always need to remove formulas on preparing a chart and do calculations elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants