Conversation

@lbliii
Contributor

@lbliii lbliii commented Oct 3, 2025

No description provided.

lbliii and others added 30 commits September 22, 2025 13:56
Signed-off-by: Lawrence Lane <[email protected]>
@lbliii lbliii self-assigned this Oct 3, 2025
@copy-pr-bot
copy-pr-bot bot commented Oct 3, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: Lawrence Lane <[email protected]>
@lbliii lbliii marked this pull request as ready for review October 3, 2025 20:37
@lbliii lbliii requested a review from arhamm1 October 3, 2025 20:37
Signed-off-by: Lawrence Lane <[email protected]>
@lbliii lbliii enabled auto-merge (squash) October 9, 2025 15:16
@lbliii lbliii requested a review from ayushdg October 9, 2025 15:16
auto-merge was automatically disabled October 9, 2025 15:33

Pull Request is not mergeable

@lbliii lbliii enabled auto-merge (squash) October 9, 2025 15:46
@lbliii lbliii requested a review from sarahyurick October 13, 2025 16:23
@lbliii lbliii merged commit b46fcfb into NVIDIA-NeMo:main Oct 13, 2025
11 checks passed
lbliii added a commit to lbliii/NeMo-Curator that referenced this pull request Oct 22, 2025
* text curation updates

Signed-off-by: Lawrence Lane <[email protected]>

* concepts

Signed-off-by: Lawrence Lane <[email protected]>

* remove synthetic docs not for this release

Signed-off-by: Lawrence Lane <[email protected]>

* updates

Signed-off-by: Lawrence Lane <[email protected]>

* text concepts and getting started changes

Signed-off-by: Lawrence Lane <[email protected]>

* links, concepts

Signed-off-by: Lawrence Lane <[email protected]>

* crosslinks

Signed-off-by: Lawrence Lane <[email protected]>

* quality assessment updates

Signed-off-by: Lawrence Lane <[email protected]>

* more cleanup

Signed-off-by: Lawrence Lane <[email protected]>

* semdedup

Signed-off-by: Lawrence Lane <[email protected]>

* example import cleanup

Signed-off-by: Lawrence Lane <[email protected]>

* concepts

Signed-off-by: Lawrence Lane <[email protected]>

* Update docs/about/concepts/text/data-acquisition-concepts.md

Co-authored-by: Praateek Mahajan <[email protected]>
Signed-off-by: L.B. <[email protected]>

* feedback batch 1

Signed-off-by: Lawrence Lane <[email protected]>

* feedback batch 2

Signed-off-by: Lawrence Lane <[email protected]>

* file_paths="/path/to/jsonl_directory",

Signed-off-by: Lawrence Lane <[email protected]>

* revert removal of xenna for common crawl executors

Signed-off-by: Lawrence Lane <[email protected]>

* quickstart installation steps

Signed-off-by: Lawrence Lane <[email protected]>

* Update docs/about/concepts/text/data-acquisition-concepts.md

Co-authored-by: Sarah Yurick <[email protected]>
Signed-off-by: L.B. <[email protected]>

* data loading concepts updates / simplification

Signed-off-by: Lawrence Lane <[email protected]>

* data processing feedback

Signed-off-by: Lawrence Lane <[email protected]>

* read-existing pg updates

Signed-off-by: Lawrence Lane <[email protected]>

* add-id updates

Signed-off-by: Lawrence Lane <[email protected]>

* dedup updates

Signed-off-by: Lawrence Lane <[email protected]>

* feedback

Signed-off-by: Lawrence Lane <[email protected]>

* citation file

Signed-off-by: Lawrence Lane <[email protected]>

* fix

Signed-off-by: Lawrence Lane <[email protected]>

* updates

Signed-off-by: Lawrence Lane <[email protected]>

---------

Signed-off-by: Lawrence Lane <[email protected]>
Signed-off-by: L.B. <[email protected]>
Co-authored-by: Praateek Mahajan <[email protected]>
Co-authored-by: Sarah Yurick <[email protected]>
Signed-off-by: Lawrence Lane <[email protected]>
3 participants