xml-streaming-optimization #49

Open
aditya wants to merge 4 commits from xml-streaming-optimization into auto-excel-update
Member

This branch contains optimized code that skips description parsing on resume and only counts the descriptions, which avoids the overhead of parsing the entire dataset again. It also contains a multithreaded implementation of description parsing, plus checkpoint updates using the tracker sheet.
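
For readers who have not seen the diff, here is a minimal sketch of the resume idea described above. It is not the code from this branch: the element name `description`, the function `iter_descriptions`, and the `already_done` parameter are hypothetical. The XML is still streamed from the start, but the first `already_done` descriptions are only counted and never handed to downstream processing.

```python
import io
import xml.etree.ElementTree as ET


def iter_descriptions(source, already_done=0, tag="description"):
    """Yield (index, text) for each <tag> element after the first `already_done`.

    Hypothetical helper: skipped elements are counted but their text is
    never extracted or processed.
    """
    seen = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != tag:
            continue
        seen += 1
        if seen > already_done:
            yield seen, (elem.text or "").strip()
        # Drop the element's text and children so memory stays small.
        elem.clear()


sample = (
    b"<items>"
    b"<description>a</description>"
    b"<description>b</description>"
    b"<description>c</description>"
    b"</items>"
)

# Resume after a checkpoint that recorded two finished descriptions.
resumed = list(iter_descriptions(io.BytesIO(sample), already_done=2))
```

In this sketch the checkpoint is just an integer; in the PR it is said to come from the tracker sheet.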
khird approved these changes 2026-01-27 15:29:10 +00:00
@@ -370,0 +406,4 @@
self.sheet.values()
.get(
spreadsheetId=self.spreadsheet_id,
range=f"{self.worksheet_name}!A{row_num}:Z{row_num}",
First-time contributor

Hardcoding A and Z columns looks suspicious - is this just a conservative estimate or do we know that the data we want is in this subset of the sheet?
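
One way to address the concern about the fixed `A`..`Z` span is to derive the last column from the number of columns the tracker sheet is known to use. This is a sketch under assumptions: `column_letter`, `row_range`, and the column count are hypothetical names and values, not part of this PR.

```python
def column_letter(index):
    """Convert a 1-based column index to A1-style letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def row_range(worksheet_name, row_num, num_columns):
    """Build an A1 range covering one row across `num_columns` columns."""
    last = column_letter(num_columns)
    return f"{worksheet_name}!A{row_num}:{last}{row_num}"


# Hypothetical tracker sheet with 30 columns, reading row 5.
example = row_range("Tracker", 5, 30)
```

If the column count is not known in advance, A1 notation also allows whole-row ranges of the form `Sheet1!5:5`, which may be simpler than computing the last column.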
This pull request can be merged automatically.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin xml-streaming-optimization:xml-streaming-optimization
git switch xml-streaming-optimization

Merge

Merge the changes and update on Forgejo.

Warning: The "Autodetect manual merge" setting is not enabled for this repository, you will have to mark this pull request as manually merged afterwards.

Use one of the following, depending on the merge style:

Merge commit:
git switch auto-excel-update
git merge --no-ff xml-streaming-optimization

Rebase, then fast-forward:
git switch xml-streaming-optimization
git rebase auto-excel-update
git switch auto-excel-update
git merge --ff-only xml-streaming-optimization

Rebase, then merge commit:
git switch xml-streaming-optimization
git rebase auto-excel-update
git switch auto-excel-update
git merge --no-ff xml-streaming-optimization

Squash:
git switch auto-excel-update
git merge --squash xml-streaming-optimization

Fast-forward only:
git switch auto-excel-update
git merge --ff-only xml-streaming-optimization

Plain merge:
git switch auto-excel-update
git merge xml-streaming-optimization

Then push the result:
git push origin auto-excel-update
Reference
cleverdatasets/dataset-uploader!49