feat: FK Models #584

lukaszkolodziejczyk · 2025-09-29T05:31:19Z

No description provided.

…management

- Add PartitionedDataset class with LRU caching and fast metadata reading
- Refactor step_finalize_generation.py to use unified processing pipeline
- Implement sequential parent cache clearing after each FK relationship
- Add comprehensive test coverage for PartitionedDataset (22 tests)
- Maintain backward compatibility for both FK and non-FK processing
- Optimize memory usage through batch processing and cache management
- Fix test compatibility with new partitioned data processing

Key improvements:

- Addresses quadratic complexity in FK matching through efficient sampling
- Memory-bounded processing with configurable LRU cache (default: 3 partitions)
- Fast row counting using PyArrow parquet metadata
- Sequential cache clearing prevents memory accumulation during FK processing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

Support max_cached_partitions=-1 to keep all partitions in memory without LRU eviction. This enables optimal performance for scenarios where memory usage is not a constraint.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
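The commit messages above describe a partition cache bounded by `max_cached_partitions` (default 3), where `-1` disables LRU eviction. The following is a minimal sketch of that idea, not the code from this PR: the `PartitionCache` class, its `loader` callback, and the `cached_keys` property are hypothetical names chosen for illustration, and only the `max_cached_partitions` parameter and its `-1` meaning come from the commit text.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable, List


class PartitionCache:
    """Hypothetical LRU cache for loaded partitions (illustration only)."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        max_cached_partitions: int = 3,
    ) -> None:
        # -1 means "keep everything"; any other value must be a positive bound.
        if max_cached_partitions == 0 or max_cached_partitions < -1:
            raise ValueError("max_cached_partitions must be -1 or a positive integer")
        self._loader = loader
        self._max = max_cached_partitions
        self._cache: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Any:
        if key in self._cache:
            # Cache hit: mark the partition as most recently used.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: load the partition and store it.
        value = self._loader(key)
        self._cache[key] = value
        # Evict the least recently used partition unless eviction is disabled.
        if self._max != -1 and len(self._cache) > self._max:
            self._cache.popitem(last=False)
        return value

    def clear(self) -> None:
        """Drop all cached partitions, e.g. after finishing one FK relationship."""
        self._cache.clear()

    @property
    def cached_keys(self) -> List[Hashable]:
        # Ordered from least to most recently used.
        return list(self._cache)
```

In this sketch, calling `clear()` between relationships would correspond to the "sequential parent cache clearing" mentioned in the commit message, while constructing the cache with `max_cached_partitions=-1` trades memory for fewer reloads.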

lukaszkolodziejczyk added 15 commits September 25, 2025 20:43

wip

9a96ded

wip

d1e3f8d

wip

6a4cf15

simplify prepare_training_data

4253039

simplify

65d9692

ref

101dd55

Merge remote-tracking branch 'origin/main' into feat-ai-smartselect

2b518f4

wip

87e6dfc

wip

7c64679

wip

d8c817f

our encodings

8ebc6a9

our encoders - part 2

e49424e

claude code - null support

8010267

claude code - temperature and top_k

7064474

claude code - use _is_null column

60a9ac2

lukaszkolodziejczyk changed the title ~~feat: AI Smart Select~~ feat: FK Models Oct 2, 2025

lukaszkolodziejczyk and others added 14 commits October 2, 2025 21:38

cleanup

1d20535

bump engine

47bbb91

fix

7de38ca

test

84818fe

fix

b6a8d02

Fix FK interleaving for independent parent pools per child

80748df

linting

75c0688

new fk_model data pulling logic

03bbc66

reset_index bug fix

f080e80

code simplifications

df2edc9

improved logging and default parameters

c83c58d

improved parent batch sampling strategy to improve efficiency

dd24ded

abon-mostly and others added 3 commits October 21, 2025 17:24

added try catch

f5f1f5b

Merge remote-tracking branch 'origin/main' into feat-ai-smartselect

f66c7ab

revert progress change

ea79d6b

This comment was marked as outdated.

lukaszkolodziejczyk added 4 commits October 22, 2025 15:03

remove some hardcode

17ec81f

remove some hardcode

d2fb933

improvs

37b27ab

improvs

fce3132

This comment was marked as outdated.

lukaszkolodziejczyk added 5 commits October 22, 2025 16:32

improvs

79275a8

improvs

7618802

merge fk_models.py and non_context.py

21cb241

refs

299fe54

refs

65d62af

This comment was marked as outdated.

lukaszkolodziejczyk added 8 commits October 22, 2025 18:45

refs

20fa265

partition by partition in random fk assignment;

f7f1c81

partition by partition in random fk assignment;

58a9539

refs

555cace

fix infinite loop

1d2ea8b

safe

9926f59

top_k=20

551b5c6

remove defaults

1804f72

This comment was marked as outdated.

lukaszkolodziejczyk added 4 commits October 22, 2025 22:24

normalize probs

e6497ca

softmax

6512691

softmax

e27c47c

fix typing

d66419c

lukaszkolodziejczyk merged commit f04ef22 into main Oct 22, 2025
8 of 9 checks passed

lukaszkolodziejczyk deleted the feat-ai-smartselect branch October 22, 2025 13:13


3 participants
