Fix corrupted block causing reads to fail #17564

Merged
alluxio-bot merged 8 commits into Alluxio:master-2.x from dbw9580:fix/block-corruption
Jun 12, 2023

dbw9580 commented Jun 6, 2023

What changes are proposed in this pull request?

Fix read failure when a mismatch occurs between the block size recorded in memory by BlockMeta and the length of the actual physical block file. When such a mismatch is detected, the block is removed from worker storage, and worker falls back to reading from UFS.
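The detect, remove, and fall back flow described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `BlockIntegritySketch`, `ReadSource`, `isBlockFileConsistent`, and `chooseReadSource` are hypothetical names, the recorded size is passed in as a plain `long` instead of coming from `BlockMeta`, and "removing the block" is reduced to deleting the file.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical sketch; these names are illustrative and are not Alluxio's actual classes. */
final class BlockIntegritySketch {
  /** Where a read ends up being served from. */
  enum ReadSource { LOCAL_CACHE, UFS }

  /** True when the block file exists and its length matches the size recorded in metadata. */
  static boolean isBlockFileConsistent(File blockFile, long recordedLength) {
    return blockFile.exists() && blockFile.length() == recordedLength;
  }

  /**
   * Picks the read source. On a mismatch the local copy is deleted so that it is
   * not served again, and the caller is told to read from the UFS instead.
   */
  static ReadSource chooseReadSource(File blockFile, long recordedLength) throws IOException {
    if (isBlockFileConsistent(blockFile, recordedLength)) {
      return ReadSource.LOCAL_CACHE;
    }
    // Mismatch or missing file: drop the local copy (a no-op if it is already gone).
    Files.deleteIfExists(blockFile.toPath());
    return ReadSource.UFS;
  }
}
```

In a real worker the removal would also have to update the in-memory metadata and respect block locks, which this sketch leaves out.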

Why are the changes needed?

Without this fix, such a mismatch causes reads of the block to fail.

Does this PR introduce any user facing changes?

No.

dbw9580 commented Jun 6, 2023

To detect corrupted blocks, two extra file system calls, `File.exists` and `File.length`, are needed on every block read. Testing is needed to evaluate the performance impact.
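As a first, rough way to gauge that cost, the pair of calls can be timed in a loop against a local file. This is only a sketch under stated assumptions: `MetadataCallCostSketch` and `averageNanosPerCheck` are hypothetical names, the loop is not a rigorous benchmark (no warm-up control, and the OS will likely serve the metadata from cache), and it says nothing about the actual worker read path.

```java
import java.io.File;

/** Hypothetical helper for a rough timing of the two extra metadata calls. */
final class MetadataCallCostSketch {
  /** Returns the average nanoseconds spent per exists() + length() pair. */
  static double averageNanosPerCheck(File file, int iterations) {
    if (iterations <= 0) {
      throw new IllegalArgumentException("iterations must be positive");
    }
    long totalLength = 0; // accumulated so the results of the calls are actually used
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      if (file.exists()) {
        totalLength += file.length();
      }
    }
    long elapsed = System.nanoTime() - start;
    if (totalLength < 0) {
      throw new IllegalStateException("file lengths are never negative");
    }
    return (double) elapsed / iterations;
  }
}
```

Comparing this figure with the time taken to read a whole block would give a first impression of the relative overhead; a benchmark harness such as JMH would be needed for dependable numbers.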

dbw9580 requested review from apc999 and beinan June 6, 2023 12:16

beinan left a comment

Looks good to me, though there might be a very minor performance impact on the read path.

dbw9580 commented Jun 12, 2023

alluxio-bot, merge this please.

@alluxio-bot

merge failed:
Merge refused because pull request does not have label start with type-

dbw9580 added the type-bug and area-worker labels Jun 12, 2023

dbw9580 commented Jun 12, 2023

alluxio-bot, merge this please

alluxio-bot merged commit 87ddfd4 into Alluxio:master-2.x Jun 12, 2023

dbw9580 commented Jun 12, 2023

alluxio-bot, cherry-pick this branch 2.10 please

dbw9580 commented Jun 12, 2023

alluxio-bot, cherry-pick this to branch-2.10 please

@alluxio-bot

Auto cherry-pick to branch branch-2.10 successfully opened PR: #17594

alluxio-bot pushed a commit that referenced this pull request Jun 12, 2023
### What changes are proposed in this pull request?

Fix read failure when a mismatch occurs between the block size recorded in memory by `BlockMeta` and the length of the actual physical block file. When such a mismatch is detected, the block is removed from worker storage, and worker falls back to reading from UFS.

### Why are the changes needed?

This causes reading the block to fail.

### Does this PR introduce any user facing changes?

No.

pr-link: #17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b

dbw9580 commented Jun 12, 2023

alluxio-bot, cherry-pick this to branch-2.9 please

@alluxio-bot

Auto cherry-pick unsuccessful Failed to setup local git for auto cherry-pick

dbw9580 commented Jun 12, 2023

alluxio-bot, cherry-pick this to branch-2.9 please

@alluxio-bot

Auto cherry-pick unsuccessful Failed to setup local git for auto cherry-pick

alluxio-bot added a commit that referenced this pull request Jun 12, 2023
Cherry-pick of existing commit.
orig-pr: #17564
orig-commit: 87ddfd4
orig-commit-author: Bowen Ding <6999708+dbw9580@users.noreply.github.com>
pr-link: #17594
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
alluxio-bot pushed a commit that referenced this pull request Jun 12, 2023
Manual cherry-pick of #17564.
pr-link: #17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
pr-link: #17595
change-id: cid-7ebcd5767243e4e6d65ec7105baecea2f72af433
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request Jun 30, 2023
pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request Jul 11, 2023
pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
dbw9580 added a commit to jiacheliu3/alluxio that referenced this pull request Jul 12, 2023
pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request Jul 13, 2023
pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
maobaolong pushed a commit to maobaolong/alluxio that referenced this pull request Jan 3, 2024
pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b
Labels

area-worker (Alluxio worker), type-bug (This issue is about a bug)

3 participants