unit-tests-google-tracker-sheet #48
No reviewers
Labels
No labels
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Support
Type
Task
Type
Testing
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleverdatasets/dataset-uploader!48
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "unit-tests-google-tracker-sheet"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This branch contains the behave bdd tests for google_sheets_tracker.py and also for upload_all_datasets.py
ruff format. 7f50361683ruff check --fix. c5707689d6upload_all_datasets.2.pywritten. Will run overnight. 1d5e791963dataset_dict.save_to_disk. 68587e5e6cdataset_config. c152323097pyrightapproval. e7a540aeb3pyprojectto require Python 3.10. da9f785326WIP: unit-tests-google-tracker-sheetto unit-tests-google-tracker-sheetI have one comment that i think might be relevant to address.
@ -0,0 +143,4 @@context.auth_result = context.tracker.authenticate()except Exception as e:context.auth_result = Falsecontext.error = str(e)I am not sure if i am okay with steps like these that are so generic... i mean, it can silently hide the true failure reason as it tries so many different things depending on the context. i think this step should be split into more fine grained steps that do well defined things and easier to manually check when something fails.
fixed !
6af250fc78to7f6594d99aA lot of little changes. I didn't have time to read the
/features/stepsfiles.@ -0,0 +44,4 @@Scenario: Column letter conversion for column ABWhen I convert column number 28 to letterThen the column letter result should be "AB"The last four scenarios are duplicated in features/google_sheets_tracker/column_letter_conversion.feature , and can be removed.
fixed !!
@ -0,0 +41,4 @@Scenario: Convenience methods return success statusWhen I mark "wordnet" as in progressThen the mark_in_progress method should return TrueI would be more comfortable if this test added something like
and the negative tests for the next three tests as well.
fixed !!
My apologies if I weren't clear enough.
The current lines 42-53 of
convenience_methods.featureare:I would be more comfortable if the above were replaced with something like
I apologize, I incorrectly replied to this comment as fixed.
I don't think the test you suggested matches our real-world use cases. Here's why:
Actually, calling these methods sequentially does happen in production:
Script Interruption (Ctrl+C): First mark_in_progress is called, status becomes IN PROGRESS. Then user presses Ctrl+C. Now reset_to_not_started must work and return True to reset the dataset. If it returns False, the dataset stays stuck as IN PROGRESS forever.
Processing Error: First mark_in_progress is called successfully. Then download fails immediately. Now mark_error must work and return True to record what went wrong. If it returns False, we can't record the error.
Successful Completion: First mark_in_progress is called. Processing succeeds. Then mark_completed must work and return True to mark it done. If it returns False, we can't mark the work as complete.
These are all valid workflows that require methods to work together, not exclude each other. The test expectations would break our error handling and recovery mechanisms.
Moreover, google sheet tracker module is not used independently, its only being used with the automation scripts like rdf_to_hf_incremental.py and upload_all_datasets.py which use the validation method check_should_process before calling any of these convenience methods.
@ -0,0 +26,4 @@| Dataset Name | Status | Error || wordnet | NOT STARTED | |When I attempt to mark "nonexistent" as in progressThen the update should failThis test seems very close to features/google_sheets_tracker/convenience_methods.feature line 56.
fixed !!
@ -0,0 +49,4 @@Given an authenticated sheets trackerAnd a sheet with empty rowsWhen I get the status of "wordnet"Then the status should be NoneThis test seems almost identical to features/google_sheets_tracker/error_handling.feature line 16.
fixed !!
@ -0,0 +68,4 @@And a sheet with only one row (headers only)When I get the status of "wordnet"Then the status should be NoneAnd the row should be NoneThis test also seems almost identical to features/google_sheets_tracker/error_handling.feature line 16.
fixed !!
@ -0,0 +76,4 @@| Dataset Name | Status | Error || wordnet | NOT STARTED | |When I attempt to update status for "nonexistent-dataset"Then the update should return FalseThis test is almost the same as features/google_sheets_tracker/convenience_methods.feature lines 56-63.
fixed !!
@ -0,0 +25,4 @@And a sheet with headers "ID, Status, Error"When the tracker loads headersThen dataset_name_col should be identified as 0And status_col should be 1Why isn't
errorchecked here?fixed !!
@ -0,0 +32,4 @@And a sheet with headers "Dataset Name, State, Error"When the tracker loads headersThen dataset_name_col should be 0And status_col should be identified as 1Why isn't
errorchecked here?fixed !!
@ -0,0 +39,4 @@And a sheet with headers "Dataset Name, Status, Error Message"When the tracker loads headersThen dataset_name_col should be 0And error_col should be identified as 2Why isn't
status_colchecked here?fixed !!
@ -0,0 +45,4 @@Scenario: Process when tracker not authenticatedGiven an unauthenticated sheets trackerWhen I check if "wordnet" should be processedThen the decision should be to process"should be" to process? Is it possible that this is in the wrong direction?
fixed !!
@ -0,0 +50,4 @@Scenario: Error message truncationWhen I reset "wordnet" to not started with error message longer than 500 charactersThen the error for "wordnet" should be at most 500 charactersThis test looks similar to features/google_sheets_tracker/error_message_handling.feature line 12.
fixed !!
@ -0,0 +55,4 @@| wordnet | NOT STARTED | |When I attempt to update status for "nonexistent"Then the update should fail gracefullyAnd no exception should be raisedThis test seems very close to features/google_sheets_tracker/convenience_methods.feature line 56.
fixed !!
@ -0,0 +62,4 @@| Dataset Name | Status | Error || wordnet | NOT STARTED | |When I attempt to update status with very long valueThen the update should handle the long value correctlyIn other places, the limit was 500 characters. Is that also true here?
fixed !!
@ -0,0 +7,4 @@Given an authenticated sheets trackerScenario: Update status with only status column present (no error column)Given a sheet with headers "Dataset Name, Status"The
Givenshould probably say something about this sheet being real not virtual. This test looks very similar to features/google_sheets_tracker/update_status_edge_cases.feature lines 10-16, so I would have expected the setup to be the same... but only at line 14 does it say that the update is with a real API call.The same should be done for the rest of the tests. Compare them against
update_status_edge_casesand make sure that there is a difference in theGivenstatement.fixed !!
@ -0,0 +1,2646 @@"""Step definitions for Google Sheets tracker tests."""There are three ways you could organize this file:
@givens together, then all@whens, then all@thens..featurefile. Within each.featurefile, put all@givens together, then all@whens, then all@thens.@given,@whenand@then. (This will make the file a lot longer, and you'll need to have references whenever you have the same@givenor whatnot.)It looks like this code was kinda-sorta organized according to 3.
fixed !!
Which one of the three ways did you organize the file?
When I look at !48 (commit
ac1cc304f9) , I don't see much organization.(HINT: You could have said "I don't think this is important enough to do.", and I would have agreed. But you reported that you fixed it.)
I misunderstood. I didn't actually reorganize it. Given the size of the file (2,667 lines), reorganizing it would be a significant effort, and it doesn’t seem important enough to prioritize right now.
@ -0,0 +95,4 @@if header.strip().lower() in ["status", "processing status"]:context.tracker.status_col = idxelif header.strip().lower() in ["error", "error message"]:context.tracker.error_col = idxDoes the tracker remember the column of the dataset name?
fixed !!
@ -0,0 +2630,4 @@Is there a reason for this long gap?
fixed !!
There's still an eleven-line gap between lines 2645-2656. Thanks for eliminating two of the lines, but the long gap remains.
(It's not necessarily WRONG. It just very much stands out when one reads the code.)
Could you either drop it down to two lines or just say what the gap is about?
@ -523,2 +582,2 @@graph = Graph()graph.parse(str(file_path), format=rdf_format)# Check if this is GeoNames format (special handling required)if rdf_format in ("xml", "application/rdf+xml") and _detect_geonames_format(file_path):According to
ruff check:fixed !!
@ -828,0 +965,4 @@subject=subject,predicate=predicate,object=obj,object_type="literal", # TSV typically has literalsAccording to
ruff check:fixed !!
@ -828,0 +970,4 @@object_language=None))except Exception as e:logger.debug(f"Skipping malformed TSV line {line_num}: {e}")According to
ruff check:fixed !!
@ -840,0 +990,4 @@# Create train/test split if requestedif config.create_train_test_split:assert isinstance(dataset, Dataset), "Expected Dataset instance"train_test = dataset.train_test_split(test_size=config.test_size, seed=42)According to
ruff check:fixed !!
@ -1160,2 +1324,2 @@f"{num_workers} workers[/yellow]")# Handle compressed filesis_gzip = config.input_path.suffix == ".gz" or str(config.input_path).endswith(".gz")According to
ruff check:fixed !!
@ -1162,0 +1327,4 @@try:if is_gzip:file_obj = gzip.open(config.input_path, "rt", encoding="utf-8", errors="ignore") # noqa: SIM115According to
ruff check:fixed !!
@ -1165,3 +1334,1 @@f"[yellow]Using standard RDF parser for "f"{config.rdf_format} (single-threaded)[/yellow]")file_obj = open(config.input_path, encoding="utf-8", errors="ignore") # noqa: SIM115According to
ruff check:fixed !!
@ -1168,0 +1350,4 @@finally:file_obj.close()except Exception as e:logger.warning(f"Error detecting GeoNames format: {e}, using generic parser")According to
ruff check:fixed !!
@ -1249,0 +1435,4 @@# Create train/test split if requestedif config.create_train_test_split:assert isinstance(dataset, Dataset), "Expected Dataset instance"train_test = dataset.train_test_split(test_size=config.test_size, seed=42)According to
ruff check:fixed !!
@ -1278,0 +1473,4 @@# Use single process for small datasets to avoid index errors# Only use multiprocessing for datasets with > 1000 examplesif total_triples > 1000 and num_workers > 1:dataset_dict.save_to_disk(str(config.output_path), num_proc=num_workers)According to
ruff check:fixed !!
View command line instructions
Checkout
From your project repository, check out a new branch and test the changes.Merge
Merge the changes and update on Forgejo.Warning: The "Autodetect manual merge" setting is not enabled for this repository, you will have to mark this pull request as manually merged afterwards.