Remove async persist -- master side#17980

Open
jiacheliu3 wants to merge 851 commits into Alluxio:main from jiacheliu3:remove-async-persist

Conversation

@jiacheliu3
Contributor

@jiacheliu3 jiacheliu3 commented Aug 12, 2023

What changes are proposed in this pull request?

This change removes the async persist implementation from the Master component. Specifically, the following elements are removed:

  1. Removed PersistenceChecker and PersistentScheduler, along with the corresponding persistence tracking logic, from DefaultFileSystemMaster.
  2. Removed AsyncPersistHandler on the Master. It has been obsolete since 2.x but was never removed. Related utilities such as FileSystemMasterView are also removed.
  3. Removed async persist methods such as FileSystemMaster.scheduleAsyncPersistence().
  4. Removed the corresponding tests and test cases.

This change focuses on master-side code, while #17963 targets the ASYNC_THROUGH write type itself. Therefore, code that is not master-specific is not removed here. By area:

  1. RPC definitions are kept.
  2. Most master-only property keys are removed. All user-side property keys are kept and belong to "Remove the usage of must_cache and async_cache" (#17963). Some exceptions are keys referenced by client/worker/common code.
  3. A few master-only metrics are removed. Many metrics, such as MASTER_ASYNC_PERSIST_SUCCESS, are actually used by DistributedCmdMetrics and are therefore not removed here.
  4. Async persist logic still exists in some methods in InodeTree and the journal. It is not removed because the implications are somewhat complicated and belong in a separate PR.

Why are the changes needed?

ASYNC_THROUGH and FileSystemMaster are both going to be removed. This is one step towards that.

Does this PR introduce any user facing changes?

jianghuazhu and others added 30 commits June 30, 2023 13:49
…[DOCFIX] Fix some descriptions related to the FileSystemMaster module The purpose of this PR is to fix some descriptions related to the FileSystemMaster module. The module contains some inappropriate descriptions or comments, and we should fix as many of them as possible.	pr-link: Alluxio#17215	change-id: cid-361c9b28832a3797941987663080f7e2a57d4965
…adSleeper Implements a light thread sleeper that supports waking a sleeping sleeper so it can determine whether it needs to keep sleeping. Without this feature, refreshing the interval of the Heartbeat thread cannot take effect. No	pr-link: Alluxio#17298	change-id: cid-a4c471eb06e6389868278fab088556c8d7a23986
### What changes are proposed in this pull request? Calls shutdown on the executor service in the S3 UFS class. ### Why are the changes needed? If shutdown is not called the threads will not be garbage collected. ### Does this PR introduce any user facing changes? No	pr-link: Alluxio#15748	change-id: cid-86d16e80118882044bf529a619b7915c8451eb03
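A minimal sketch of the pattern this commit describes, shutting down an owned executor when the owning object is closed. `PooledUfsSketch` and its methods are hypothetical names used for illustration only; this is not the actual S3 UFS class, and the 5 second wait is an arbitrary assumption.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a UFS class that owns a thread pool. */
class PooledUfsSketch implements Closeable {
  // Non-daemon pool threads stay alive until the executor is shut down.
  private final ExecutorService mExecutor = Executors.newFixedThreadPool(2);

  ExecutorService executor() {
    return mExecutor;
  }

  @Override
  public void close() {
    // Stop accepting new tasks and let queued tasks finish.
    mExecutor.shutdown();
    try {
      // Assumed timeout; interrupt running tasks if they do not finish in time.
      if (!mExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
        mExecutor.shutdownNow();
      }
    } catch (InterruptedException e) {
      mExecutor.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }
}
```

Without the `shutdown()` call in `close()`, the pool threads would outlive the object that created them.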
### What changes are proposed in this pull request?

Fix a deadlock when the master process exits.

### Why are the changes needed?

```
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
	at java.lang.Shutdown.exit(java.base@11.0.12-ga/Shutdown.java:173)
	- **waiting to lock <0x00007f55e98d1920> (a java.lang.Class for java.lang.Shutdown)**
	at java.lang.Runtime.exit(java.base@11.0.12-ga/Runtime.java:116)
	at java.lang.System.exit(java.base@11.0.12-ga/System.java:1752)
	at alluxio.ProcessUtils.fatalError(ProcessUtils.java:83)
	at alluxio.ProcessUtils.fatalError(ProcessUtils.java:63)
	at alluxio.master.journal.MasterJournalContext.waitForJournalFlush(MasterJournalContext.java:99)
	at alluxio.master.journal.MasterJournalContext.close(MasterJournalContext.java:109)
	- locked <0x00007f5602000010> (a alluxio.master.journal.MasterJournalContext)
	at alluxio.master.journal.StateChangeJournalContext.close(StateChangeJournalContext.java:55)
	at alluxio.master.block.DefaultBlockMaster.getNewContainerId(DefaultBlockMaster.java:906)
	- locked <0x00007f594a000298> (a alluxio.master.block.BlockContainerIdGenerator)
	at alluxio.master.file.meta.InodeDirectoryIdGenerator.initialize(InodeDirectoryIdGenerator.java:83)
	at alluxio.master.file.meta.InodeDirectoryIdGenerator.getNewDirectoryId(InodeDirectoryIdGenerator.java:57)
	- **locked <0x00007f55e197de68> (a alluxio.master.file.meta.InodeDirectoryIdGenerator)**
	at alluxio.master.file.meta.InodeTree.createPath(InodeTree.java:979)
	at alluxio.master.file.DefaultFileSystemMaster.createDirectoryInternal(DefaultFileSystemMaster.java:2746)
	at alluxio.master.file.InodeSyncStream.loadDirectoryMetadataInternal(InodeSyncStream.java:1374)
	at alluxio.master.file.InodeSyncStream.loadDirectoryMetadata(InodeSyncStream.java:1294)
	at alluxio.master.file.TxInodeSyncStream.concurrentLoadMetadata(TxInodeSyncStream.java:123)
	at alluxio.master.file.TxInodeSyncStream.loadMetadataForPath(TxInodeSyncStream.java:98)
	at alluxio.master.file.InodeSyncStream.syncInodeMetadata(InodeSyncStream.java:743)
	at alluxio.master.file.InodeSyncStream.syncInternal(InodeSyncStream.java:491)
	at alluxio.master.file.InodeSyncStream.sync(InodeSyncStream.java:409)
	at alluxio.master.file.DefaultFileSystemMaster.syncMetadata(DefaultFileSystemMaster.java:4075)
	at alluxio.master.file.TxFileSystemMaster.syncMetadata(TxFileSystemMaster.java:298)
	at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1111)
	at alluxio.master.file.TxFileSystemMaster.listStatus(TxFileSystemMaster.java:827)
	...
```

```
stackTrace:
java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.12-ga/Native Method)
	- waiting on <no object reference available>
	at java.lang.Thread.join(java.base@11.0.12-ga/Thread.java:1300)
	- waiting to re-lock in wait() <0x00007f55e98d0918> (a java.lang.Thread)
	at java.lang.Thread.join(java.base@11.0.12-ga/Thread.java:1375)
	at java.lang.ApplicationShutdownHooks.runHooks(java.base@11.0.12-ga/ApplicationShutdownHooks.java:107)
	at java.lang.ApplicationShutdownHooks$1.run(java.base@11.0.12-ga/ApplicationShutdownHooks.java:46)
	at java.lang.Shutdown.runHooks(java.base@11.0.12-ga/Shutdown.java:130)
	at java.lang.Shutdown.exit(java.base@11.0.12-ga/Shutdown.java:174)
	- locked <0x00007f55e98d1920> (a java.lang.Class for java.lang.Shutdown)
	at java.lang.Runtime.exit(java.base@11.0.12-ga/Runtime.java:116)
	at java.lang.System.exit(java.base@11.0.12-ga/System.java:1752)
	at alluxio.ProcessUtils.fatalError(ProcessUtils.java:83)
	at alluxio.ProcessUtils.fatalError(ProcessUtils.java:63)
	at alluxio.master.journal.MasterJournalContext.waitForJournalFlush(MasterJournalContext.java:99)
	at alluxio.master.journal.MasterJournalContext.close(MasterJournalContext.java:109)
	- locked <0x00007f5c0a60bda0> (a alluxio.master.journal.MasterJournalContext)
	at alluxio.master.journal.StateChangeJournalContext.close(StateChangeJournalContext.java:55)
	at alluxio.master.journal.FileSystemMergeJournalContext.close(FileSystemMergeJournalContext.java:90)
	- locked <0x00007f5c0a60bde0> (a alluxio.master.journal.FileSystemMergeJournalContext)
	at alluxio.master.file.RpcContext.closeQuietly(RpcContext.java:141)
	at alluxio.master.file.RpcContext.close(RpcContext.java:129)
	at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1227)
	at alluxio.master.file.TxFileSystemMaster.listStatus(TxFileSystemMaster.java:827)
```

```
java.lang.Thread.State: BLOCKED (on object monitor)
	at alluxio.master.file.meta.InodeDirectoryIdGenerator.peekDirectoryId(InodeDirectoryIdGenerator.java:76)
	- **waiting to lock <0x00007f55e197de68> (a alluxio.master.file.meta.InodeDirectoryIdGenerator)**
	at alluxio.master.file.DefaultFileSystemMaster.stop(DefaultFileSystemMaster.java:787)
	at alluxio.master.file.TxFileSystemMaster.stop(TxFileSystemMaster.java:245)
	at alluxio.master.AbstractMaster.close(AbstractMaster.java:156)
	at alluxio.master.file.DefaultFileSystemMaster.close(DefaultFileSystemMaster.java:800)
	at alluxio.master.file.TxFileSystemMaster.close(TxFileSystemMaster.java:581)
	at alluxio.Registry.close(Registry.java:156)
	at alluxio.master.AlluxioMasterProcess.stop(AlluxioMasterProcess.java:412)
	- locked <0x00007f55e51e2ba8> (a java.util.concurrent.atomic.AtomicBoolean)
	at alluxio.ProcessUtils.lambda$stopProcessOnShutdown$0(ProcessUtils.java:98)
	at alluxio.ProcessUtils$$Lambda$363/0x00007f54f02dc840.run(Unknown Source)
	at java.lang.Thread.run(java.base@11.0.12-ga/Thread.java:829)
```

The blocked listStatus thread (thread1) waits for lock <0x00007f55e98d1920>, which is held by another waiting listStatus thread (thread2) that is also trying to exit. thread2 waits for the shutdown hook (alluxio-process-shutdown-hook) to finish before it continues exiting. The alluxio-process-shutdown-hook waits for lock <0x00007f55e197de68>, which is held by thread1.

### Does this PR introduce any user facing changes?

No
	pr-link: Alluxio#17628	change-id: cid-d077517ba20445ebe95520a1710c173b41810f6d
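The three-way wait described in this commit can be checked mechanically by modeling it as a wait-for graph. The sketch below is illustrative only: `WaitForGraphSketch` and its methods are hypothetical, and the graph encodes just the three waits reported in the stack traces (thread1 waits on thread2, thread2 waits on the shutdown hook, the hook waits on thread1). It does not reproduce the deadlock or show the fix.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that models "who waits for whom" as a simple graph. */
class WaitForGraphSketch {
  /** Returns true if following the waits-for edges from start revisits a node. */
  static boolean hasCycle(Map<String, String> waitsFor, String start) {
    Set<String> seen = new HashSet<>();
    String current = start;
    while (current != null) {
      if (!seen.add(current)) {
        return true; // Already visited: the chain of waits loops back on itself.
      }
      current = waitsFor.get(current);
    }
    return false;
  }

  /** The waits reported in the commit message, one edge per blocked thread. */
  static Map<String, String> reportedWaits() {
    Map<String, String> waitsFor = new LinkedHashMap<>();
    // thread1 wants the java.lang.Shutdown class lock held by thread2.
    waitsFor.put("thread1", "thread2");
    // thread2 joins the shutdown hook thread while holding that lock.
    waitsFor.put("thread2", "alluxio-process-shutdown-hook");
    // The hook wants the InodeDirectoryIdGenerator lock held by thread1.
    waitsFor.put("alluxio-process-shutdown-hook", "thread1");
    return waitsFor;
  }
}
```

Removing any one of the three edges breaks the cycle, which is why the thread cannot make progress only when all three waits hold at once.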
### What changes are proposed in this pull request? Use ufs iterable for copy and move ### Why are the changes needed? save memory ### Does this PR introduce any user facing changes? na Test with distCp & s3	pr-link: Alluxio#17719	change-id: cid-9532c9487d3d55b113e01aa2385459b5a7501562
## Background ## We found a bug: the worker doesn't check the size when writing temp pages. As a result, the size of the pages stored in the Alluxio Worker can exceed the capacity limit, as shown in the following snapshot, which makes the worker prone to crashing. <img width="1785" alt="image" src="https://github.com/Alluxio/alluxio/assets/6129818/5e53f928-732c-43e6-b55f-3bd4855e8031"> This PR fixes the bug by checking the size of the temp page to be written.	pr-link: Alluxio#17711	change-id: cid-b472a6b368f311e64fae68231366497fa0ba00ad
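A minimal sketch of the kind of check this commit describes, assuming a store that tracks used bytes against a fixed capacity. `TempPageQuotaSketch` and its methods are hypothetical and do not reflect the actual worker page store code.

```java
/** Hypothetical capacity tracker that refuses temp pages which would not fit. */
class TempPageQuotaSketch {
  private final long mCapacityBytes;
  private long mUsedBytes;

  TempPageQuotaSketch(long capacityBytes) {
    mCapacityBytes = capacityBytes;
  }

  /**
   * Reserves space for a temp page before it is written.
   * Returns false (and reserves nothing) if the page would exceed the capacity.
   */
  synchronized boolean tryReserveTempPage(long pageSizeBytes) {
    if (pageSizeBytes < 0 || pageSizeBytes > mCapacityBytes - mUsedBytes) {
      return false;
    }
    mUsedBytes += pageSizeBytes;
    return true;
  }

  synchronized long usedBytes() {
    return mUsedBytes;
  }
}
```

The point of the sketch is that the size check happens before the write, so stored bytes can never exceed the configured capacity.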
### What changes are proposed in this pull request?

This change removes the worker-side implementation of the Block API, i.e. how a file is managed in units of blocks and how the I/O and management work. Needless to say, the corresponding unit and integration tests are removed along with the functionality.

### Removed functionalities

Instead of listing the removed classes one by one, the corresponding functionalities are listed here for future reference.

1. All Allocator, Evictor and Reviewer classes and implementations are removed. Those classes specify how a block is allocated and evicted. They include all code under `worker/block/allocator/`, `worker/block/evictor/` and `worker/block/reviewer/`.
2. All tiered storage related code is removed, including the logic to sort blocks between tiers and to move them between tiers. That code typically resides in `worker/block/annotator/` and `worker/block/management/`. Tiered store representations like `StorageDir` are also removed. Finally, `TieredBlockStore` is removed.
3. New BlockStore implementations like MonoBlockStore, UnderFileSystemBlockStore and LocalBlockStore are removed. Those implementations were added late in the 2.x line to bring the Page API in under the Block interface. PagedBlockStore is also removed. PagedBlockStore is an intermediate solution to convert Page to Block. That is no longer necessary because we now use Page directly. All code relevant to the page-block conversion and the wrapped PagedBlockStore reader/writer, which typically belongs to `worker/page/`, is removed altogether. `underfs/PagedUfsReader` is removed too.
4. Metadata representations related to those features are removed, together with their tests. That logic is typically in `worker/block/meta`.
5. BlockReaderFactory/BlockWriterFactory and their children like TieredBlockReaderFactory are removed.
6. Block information view representations are removed. Some examples are BlockMetadataView and BlockMetadataManager.
7. Worker-side register and heartbeat logic is removed. This includes the normal register process like BlockMasterSync. This also includes special register logic like registering to all masters in HA, as well as reporting metrics and block updates in a heartbeat. One exception is that some register logic is still used in PagedDoraWorker, so it will be kept around until a refactor purges the rest. PinList sync is removed too.
8. Worker-side block read and write handlers are removed. This includes client I/O handlers like BlockReadHandler, ShortCircuitBlockReadHandler, UfsFallbackBlockReadHandler in `worker/grpc`, `worker/netty` and `worker/block/io`.
9. Worker-side block readers and writers are removed. This part specifies how a worker reads/writes from a remote worker or UFS. Some examples are `worker/block/RemoteBlockReader` and `worker/block/UnderFileSystemBlockReader`.
10. Some request definitions and their RPC contexts are removed. For example, `BlockWriteRequest` and `BlockWriteRequestContext`.
11. Async caching logic is removed, together with the worker logic to handle async cache requests.
12. Async block removal logic is removed.
13. Block locking and lock management logic is removed.
14. Stress tests on worker register/heartbeat logic are removed.
15. General client block service handler BlockWorkerClientServiceHandler is removed.
16. REST service handler AlluxioWorkerRestServiceHandler is removed. That means the worker no longer serves REST requests. We need to implement new REST service definitions because that feature is still relevant.
17. DefaultBlockWorker is removed, together with its constructor class BlockWorkerFactory and BlockWorkerModule.
18. The old `worker/block/FuseManager` is removed.
19. Worker microbench tests are removed.

### Exceptions

We try really hard NOT to modify existing code, because this PR is very hard to review and discuss. Most of the switch changes are embedded in Alluxio@3febe93. If a change needs discussion, it will be postponed to a separate PR. We had to keep some classes around because it is hard to remove them without incurring extra code changes. Generally speaking, a piece of code is not removed if:

1. It is not in the worker module. For example, the code is under `alluxio-core-common` or `alluxio-server-common`. The code is there because it is used by the master/client side, so it is hard to remove now.
2. It requires extra code changes to remove. We don't want extra code changes in this PR so we don't do it now. For example, `PagedDoraWorker` incorrectly reused some register code, so we keep that register code until the next refactor.

For many such cases, `TODO`s are added explaining why a class is not removed, in Alluxio@8c20387. But note there may be many oversights.

### Removed integration tests

Many integration tests are removed. They became irrelevant because they mainly target the Block API. However, the functionality behind them might still have some value, meaning the test cases might be worth another look in the future. They are:

```
dora/tests/src/test/java/alluxio/server/tieredstore/CapacityUsageIntegrationTest.java
dora/tests/src/test/java/alluxio/server/tieredstore/SpecificTierWriteIntegrationTest.java
dora/tests/src/test/java/alluxio/server/tieredstore/TierPromoteIntegrationTest.java
dora/tests/src/test/java/alluxio/server/tieredstore/TieredStoreIntegrationTest.java
dora/core/server/worker/src/test/java/alluxio/worker/block/DefaultBlockWorkerExceptionTest.java
dora/core/server/worker/src/test/java/alluxio/worker/block/DefaultBlockWorkerTest.java
dora/core/server/worker/src/test/java/alluxio/worker/block/DefaultBlockWorkerTestBase.java
dora/core/server/worker/src/test/java/alluxio/worker/block/stream/BlockWorkerDataReaderTest.java
dora/tests/src/test/java/alluxio/client/fs/FreeAndDeleteIntegrationTest.java
dora/tests/src/test/java/alluxio/server/tieredstore/LostStorageIntegrationTest.java
```

### Things we know are broken by this removal

1. `FuseManager` relies totally on the Block API. With that removed, fuse on worker is broken.
2. The REST service handlers rely totally on the Block API. With that removed, REST on worker is broken.

### Why are the changes needed?

The Block API is obsolete. Keeping it around only makes it harder to develop new features.
	pr-link: Alluxio#17638	change-id: cid-22164a4415a945ae10cf3778ca57f65c8510ebe6
### What changes are proposed in this pull request? Merge missing commits from master-2.x to main. The commits in 2023/05/01~2023/05/31 from Alluxio/alluxio@main...master-2.x will be included by this PR. Some commits are skipped because they have been manually ported to `main` Alluxio@4a57bab Alluxio@a84b6e6 Alluxio@ec066dc This commit Alluxio@11af4ee is not a cherry-pick but a missed out clean up work for Alluxio#17638, where a test should be removed together with the Block API.	pr-link: Alluxio#17717	change-id: cid-68bc4046c7c32294075363d2bbc62a3daef11418
address corner case when rerunning the copy operation of a directory results in worker log error: ``` 2023-07-02 04:28:45,565 ERROR PagedDoraWorker - Failed to move s3://jul02-vvrz-1/Compatibility-TestTool-1.1-alpha/lib to s3://jul02-vvrz-0/TEST1/lib alluxio.exception.runtime.FailedPreconditionRuntimeException: File s3://jul02-vvrz-0/TEST1/lib is already in UFS	at alluxio.worker.task.CopyHandler.copy(CopyHandler.java:85)	at alluxio.worker.dora.PagedDoraWorker.lambda$move$11(PagedDoraWorker.java:631)	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)	at alluxio.worker.grpc.GrpcExecutors$ImpersonateThreadPoolExecutor.lambda$execute$0(GrpcExecutors.java:180)	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)	at java.base/java.lang.Thread.run(Thread.java:829) ```	pr-link: Alluxio#17728	change-id: cid-f2013893d5fd9f19a0a338235d93401bfac33adc
### What changes are proposed in this pull request? Fix typo.	pr-link: Alluxio#17557	change-id: cid-01eb9607bab151395e5a54c1c742f93f03b41823
less ambiguous to define the dst path from project root rather than from module root. the lib/ directory is more simple to define as `build.path/../lib/`, rather than `module.root/../<some variable number of parent directories>/lib/`	pr-link: Alluxio#17730	change-id: cid-4642a5d6b464b1dba06c5410ca6e482b453005e6
### What changes are proposed in this pull request? Instead of comparing the fingerprint, we compare the fields in the file info object directly, which is more efficient. ### Why are the changes needed? In Alluxio#17458, we introduced a mechanism that populates the fingerprint and persists it into the metastore so that we can skip invalidating page caches unnecessarily when the metadata is updated. This mechanism is guarded by a property key because it does not perform well. After a discussion with @dbw9580, we came up with a more efficient way that just leverages the fields in the file info object, and we believe this lets us get rid of the switch. ### Does this PR introduce any user facing changes? N/A	pr-link: Alluxio#17725	change-id: cid-c0e7898b674cdfc5094e5eb1ee996b3a580c3bf3
### What changes are proposed in this pull request? Support fallback in position read ### Why are the changes needed? The client can fall back to the UFS when anything goes wrong in the cache ### Does this PR introduce any user facing changes? no	pr-link: Alluxio#17721	change-id: cid-9aa6105fbe41086ba6ecd59e99efa263b5b53699
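As a rough illustration of the fallback idea in this commit, the sketch below wraps a cache reader so that any failure is retried against the UFS. The `PositionReader` interface here is a simplified, hypothetical stand-in defined only for this example, and `withFallback` is an assumed helper name; neither is taken from the Alluxio client code.

```java
import java.io.IOException;

/** Hypothetical sketch of "try the cache first, fall back to the UFS on failure". */
class FallbackPositionReadSketch {
  /** Simplified positioned-read contract, defined only for this example. */
  interface PositionReader {
    int read(long position, byte[] buffer, int length) throws IOException;
  }

  /** Returns a reader that delegates to ufs whenever the cache read fails. */
  static PositionReader withFallback(PositionReader cache, PositionReader ufs) {
    return (position, buffer, length) -> {
      try {
        return cache.read(position, buffer, length);
      } catch (IOException | RuntimeException e) {
        // Anything wrong in the cache path: serve the same range from the UFS.
        return ufs.read(position, buffer, length);
      }
    };
  }
}
```

Because a positioned read carries its own offset, the retry can simply reissue the same request against the UFS without tracking stream state.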
### What changes are proposed in this pull request? Support position reader of OBS. ### Why are the changes needed? Alluxio couldn't read data in OBS. I have implemented OBSPositionReader and now it can `cat` files in OBS. Here is the screenshot about OBS's position reader. <img width="812" alt="截屏2023-06-27 22 15 24" src="https://github.com/Alluxio/alluxio/assets/57146148/c70f93be-b8d8-4ddb-b8ba-91003e7ccf9b"> ### Does this PR introduce any user facing changes? No	pr-link: Alluxio#17686	change-id: cid-0d1c75c997b3e98257119a248e2fda73240d4481
### What changes are proposed in this pull request? 1. Move metastore into dora meta manager 2. Handle listing cache invalidation 3. Add unit tests 4. Introduce Caffeine cache ### Why are the changes needed? Dora metadata management has a lot of consistency issues between the metadata/listing/page caches. To address these issues, the first step is to introduce a centralized management place and put this logic together. ### Does this PR introduce any user facing changes? N/A	pr-link: Alluxio#17710	change-id: cid-740862f412ebfea9df7d26a81ee4d6715aff5b69
### What changes are proposed in this pull request? Support Position Reader of GCS (Google Cloud Storage, Google Object Storage Service). ### Why are the changes needed? Alluxio's Position Reader of GCS was not implemented before. This PR adds position reader support for GCS. ### Does this PR introduce any user facing changes? No.	pr-link: Alluxio#17708	change-id: cid-9607f7d15602eb4a9d44e4c17c4b78b7404a1a02
### What changes are proposed in this pull request? [DOCFIX] Add multipart-upload method ### Why are the changes needed? 1. Add multipart upload method in S3, OSS, COS and OBS. 2. Add configure COS and OBS with Alluxio. ### Does this PR introduce any user facing changes? No	pr-link: Alluxio#17705	change-id: cid-d9f9b09c2d8e9a240a640363e28547d831c4d028
### What changes are proposed in this pull request? Fix to support Alluxio versions other than xxx.yyy ### Why are the changes needed? Without these changes, users from the Alluxio open source community who build a package under their own community version naming rule cannot start the Alluxio cluster successfully if the version name does not contain any `.`. ### Does this PR introduce any user facing changes? No	pr-link: Alluxio#17716	change-id: cid-925efaceea9d7cd8e1386411532f8816c7a8b90e
define new property for configurability of update checker, to be used in conjunction with the update checker enabled property. previously the enabled property has its default value set as a build time constant, but the value could be overridden if set in configuration. now the configurability value, which is set at build time, determines if the enabled property can be configured, and if it is, uses its corresponding value. this pattern allows us to lock the enabled value at build time such that a published jar cannot override this enabled property.	pr-link: Alluxio#17738	change-id: cid-5e07f324674ebf6ebd395e6a4c297655dd633bce
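One way to read the pattern in this commit is sketched below. All names are hypothetical, and the behavior when the property is not configurable (fall back to the build-time default) is an assumption based on the commit text, not a copy of the actual property key logic.

```java
import java.util.Optional;

/** Hypothetical sketch of a build-time "configurable" gate on an "enabled" property. */
class UpdateCheckSettingSketch {
  /**
   * Resolves the effective enabled value.
   *
   * @param configurable set at build time; when false, user configuration is ignored
   * @param buildTimeDefault the default enabled value baked in at build time
   * @param configured the value from user configuration, if any
   */
  static boolean resolveEnabled(
      boolean configurable, boolean buildTimeDefault, Optional<Boolean> configured) {
    if (configurable && configured.isPresent()) {
      return configured.get();
    }
    // Assumption: a locked or unset property uses the build-time default.
    return buildTimeDefault;
  }
}
```

With `configurable` set to false in a published jar, a user-supplied value has no effect, which matches the stated goal of locking the enabled value at build time.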
### What changes are proposed in this pull request? Using the content hash of the file to be the ETag in the s3 api. Add a property key to decide whether to get real content hash when loading file meta from ufs. ### Why are the changes needed? Now we no longer use Xattr to store the etag of the file. And alluxio adds a new option to get real content hash. ### Does this PR introduce any user facing changes? No	pr-link: Alluxio#17742	change-id: cid-b91ae9871b9408c5e111668bebb5a398852c229c
### What changes are proposed in this pull request? The delete/create options are currently not passed to the UFS. By passing them to the UFS correctly, we should be able to achieve recursive create/delete. However, note that under Dora structure, when we delete `/a/`, we can't also remove all `/a/*` cache from all workers immediately. Alluxio#17741	pr-link: Alluxio#17724	change-id: cid-fc5550beec4c9425505f624576b7580a9b171996
and remove the --disableTelemetry flag readded the enabled config to set the default value follow up to Alluxio#17738	pr-link: Alluxio#17740	change-id: cid-ee7e1bf3d3f8949249b68374c2d20f54261b6b4d
Fix fuse doc picture	pr-link: Alluxio#17744	change-id: cid-7e8abc9afacec5985b6159e0cbcdf73a92177c3c
Support list files through RESTful API Usage example: HTTP GET request http://localhost:28080/files?path=/test `ls` files of the path `/test` and response a JSON string <img width="793" alt="image" src="https://github.com/Alluxio/alluxio/assets/6129818/96b2d98b-1ac2-4014-b417-a92611b55381">	pr-link: Alluxio#17685	change-id: cid-d6e4b80fdebb94684254f4239a4a65d65dd81a5c
The right side bar will stay at the same position while scrolling. example without actually format: ![image](https://github.com/Alluxio/alluxio/assets/107361923/eba16e35-a180-4734-80ee-6d31acc360a2) with actually format: ![image](https://github.com/Alluxio/alluxio/assets/107361923/01d62af9-7b93-4676-b82f-599a65722f14)	pr-link: Alluxio#17749	change-id: cid-240f63f35d9b18183833eb93b8f8568ef33a55c0
### What changes are proposed in this pull request? Roll back to Java 8 to see whether it is compatible with PrestoDB and Hive.	pr-link: Alluxio#17751	change-id: cid-ef100538c3abd0a5e3d74be030defc723a5c8cef
make a differentiation between a title w/ subfiles vs a title with a single file. if it's only a single file, then create the button link instead of the dropdown style. This pr is related change based on Alluxio#17737 example without format: ![image](https://github.com/Alluxio/alluxio/assets/107361923/4d53fa9c-ec42-43df-876f-42316ef7b4f1) with format: ![image](https://github.com/Alluxio/alluxio/assets/107361923/36ae2846-9429-43a8-9f4f-cd4d98d30ef7)	pr-link: Alluxio#17760	change-id: cid-8c64be79e6e025407145999564eca28290e4642a
### What changes are proposed in this pull request? A new file `CreateBucketTest.java` was created to run putting bucket UTs for S3 proxy based on Dora version. ### Why are the changes needed? The original s3 API unit tests were very messy and irregular. It is difficult for us to confirm which testing scenarios we have correctly covered. We want to set up s3mock as the UFS and reorganize the unit test for S3 proxy based on Dora version. ### Does this PR introduce any user facing changes? none.	pr-link: Alluxio#17732	change-id: cid-71c55b0633c197d2effc919668cb7b51d5b32648
Xenorith and others added 25 commits August 17, 2023 02:39
directly call `fs cp` instead of the deleted `copyToLocal` and `copyFromLocal`	pr-link: Alluxio#18013	change-id: cid-30a30d0f76dd64b30b750c98e50383e0f6eaf29c
### What changes are proposed in this pull request? Remove MetaMasterConfigClient initialization in the client. ### Why are the changes needed? Remove MetaMasterConfigClient initialization in the client. ### Does this PR introduce any user facing changes? NA	pr-link: Alluxio#18012	change-id: cid-3a82b8c04ad0d3dbca77cf35ee8e0b2bf8832b8c
### What changes are proposed in this pull request? Prepare the worker for accessing multiple UFS's as the client requests ### Why are the changes needed? Allow worker to access arbitrary UFS's in the future. ### Does this PR introduce any user facing changes? No.	pr-link: Alluxio#17839	change-id: cid-569ed16a888f2a576bdfd69a22cb587989ff3559
Additional `exec`, `fs`, `init`, `info` commands to golang CLI as part of Alluxio#17522
- `bin/alluxio-bash runUfsIOTest` -> `bin/alluxio exec ufsIOTest`
- `bin/alluxio-bash fs chgrp` -> `bin/alluxio fs chgrp`
- `bin/alluxio-bash fs chmod` -> `bin/alluxio fs chmod`
- `bin/alluxio-bash fs chown` -> `bin/alluxio fs chown`
- `bin/alluxio-bash fsadmin metrics clear` -> `bin/alluxio init clearMetrics`
- `bin/alluxio-bash clearCache` -> `bin/alluxio init clearOSCache`
- `bin/alluxio-bash validateConf` -> `bin/alluxio init validate --type conf`
- `bin/alluxio-bash validateEnv` -> `bin/alluxio init validate --type env`
- `bin/alluxio-bash fsadmin doctor` -> `bin/alluxio info doctor`
- `bin/alluxio-bash fsadmin nodes` -> `bin/alluxio info nodes`
	pr-link: Alluxio#18007	change-id: cid-fcbc6fc230a4a3b8802bee748585469b2cd35309
### What changes are proposed in this pull request? Update main nav menu to 3 tiers Rename Basic Logging to Logging & update respective link paths Pull Glossary to level 1 ### Why are the changes needed? restructuring menu ### Does this PR introduce any user facing changes? webui	pr-link: Alluxio#17986	change-id: cid-9d6ccd0b2a94f930b611adb46a4ace88a80acdc4
### What changes are proposed in this pull request? Remove needsync commands. ### Why are the changes needed? Remove needsync commands. ### Does this PR introduce any user facing changes? NA	pr-link: Alluxio#17962	change-id: cid-38d2f6c311c5b56ddbd16d34ea8d8c2ac344efd9
### What changes are proposed in this pull request? Store workerinfo as json format on etcd for EtcdMembershipManager. ### Why are the changes needed? For cross-language worker membership retrieval. ### Does this PR introduce any user facing changes? No.	pr-link: Alluxio#17959	change-id: cid-f153d7d2033568a26c9f3d35b1c3106e2da2957c
### What changes are proposed in this pull request? Disable proxy and job services to start by default ### Why are the changes needed? Proxy (standalone), job services are deprecated in 3.0 ### Does this PR introduce any user facing changes? When launching Alluxio, by default no job service or proxy processes will be started	pr-link: Alluxio#17993	change-id: cid-7de67e836fffe6b91a571c46775bb70d3d41d8fa
- add `metadata load` command
- remove `init clearMetrics` because it definitely will not work as it calls a removed BlockClient rpc
- fix error in `init validate`
	pr-link: Alluxio#18016	change-id: cid-fc4803e9eb70253f6f3c4a1fafe959c1d0a02c59
### What changes are proposed in this pull request? Please outline the changes and how this PR fixes the issue. ### Why are the changes needed? Please clarify why the changes are needed. For instance, 1. If you propose a new API, clarify the use case for a new API. 2. If you fix a bug, describe the bug. ### Does this PR introduce any user facing changes? Please list the user-facing changes introduced by your change, including 1. change in user-facing APIs 2. addition or removal of property keys 3. webui	pr-link: Alluxio#18024	change-id: cid-4f853a31a2d8b190c7e7a02df01e7b2c6a193101
### What changes are proposed in this pull request? Please outline the changes and how this PR fixes the issue. ### Why are the changes needed? Please clarify why the changes are needed. For instance, 1. If you propose a new API, clarify the use case for a new API. 2. If you fix a bug, describe the bug. ### Does this PR introduce any user facing changes? Please list the user-facing changes introduced by your change, including 1. change in user-facing APIs 2. addition or removal of property keys 3. webui	pr-link: Alluxio#18027	change-id: cid-e1d2500c2d5116ecbeb109275768a8eb029efa0f
### What changes are proposed in this pull request? Remove loadMetadataCommand. ### Why are the changes needed? Remove loadMetadataCommand. It is not used in new arch. ### Does this PR introduce any user facing changes? NA	pr-link: Alluxio#18020	change-id: cid-58a575037eab3040a0617015aed0b8eed33e1921
loadMetadataCommand was deleted in Alluxio#18020 so remove the go cli code calling it. users should be using the load command from the job service with the --metadata-only flag to perform the same action, to be restored in a subsequent PR	pr-link: Alluxio#18030	change-id: cid-b209c60651bf7045c7065e291543179e3f967b76
### What changes are proposed in this pull request? Update headers so that once it's clicked in the right side menu, it redirects to its section, instead of external aws link ### Why are the changes needed? menu should redirect to section ### Does this PR introduce any user facing changes? n/a	pr-link: Alluxio#18028	change-id: cid-9020690c793a43b829db56e6cbd1a0b891392a34
follow up to Alluxio#17993 on the stop process bash script side. golang code doesn't have an issue because it has a single source of truth	pr-link: Alluxio#18018	change-id: cid-ae89f6d558bfbdc206422a4ecb05c8f3ba549edc
In the golang CLI build script:

- check that `go` is accessible
- check that the go version is 1.18+
- run `go mod tidy`, which should normally be a no-op, but on a first-time build it will download some native packages

Also update the compile requirements in docs.

pr-link: Alluxio#18031	change-id: cid-7efc8c8cc4c470b663bbdf267ce7b5823a58d5c9
### What changes are proposed in this pull request?

Add metrics for the ages of the pages in the page cache store.

### Why are the changes needed?

To make sure we are not keeping data in the cache too long.

### Does this PR introduce any user facing changes?

No.

pr-link: Alluxio#18035	change-id: cid-36d458b68e059ce8cb934db7459bf4af3f298b93
### What changes are proposed in this pull request?

The LsCommand displays DataInAlluxioPercentage, but it always shows 0%, which is misleading. This change replaces the DataInAlluxioPercentage column with the fixed string FILE in the LsCommand.

### Why are the changes needed?

As mentioned above, it fixes the bug that LsCommand always shows 0% for DataInAlluxio.

### Does this PR introduce any user facing changes?

Yes, it changes the output of LsCommand for files but not for directories. For files, the output follows this format:

```
./alluxio fs ls /file_path
FileSize PersistenceState Timestamp FILE FilePath
```

The percentage of data in Alluxio is replaced by the fixed string FILE, as shown above.

pr-link: Alluxio#18009	change-id: cid-8b50cec273f5c8edbe7fa14640c59bd41a431e42
### What changes are proposed in this pull request?

- Change the serialize/deserialize interface for future use (Alluxio#17959)
- Expose the TTL and timeout of the lease used in `ServiceDiscoveryRecipe`
- Expose the path prefix of the `ServiceDiscoveryRecipe` for future use

### Why are the changes needed?

etcd will be used in the near future for more areas than worker service discovery, so expose more information for future use.

### Does this PR introduce any user facing changes?

No.

pr-link: Alluxio#18010	change-id: cid-7cc0b9f5cfaee860e587941c37867797b7fe7bdb
Add `init` commands to the golang CLI as part of Alluxio#17522:

- `bin/alluxio-bash format` -> `bin/alluxio init format`
- `bin/alluxio-bash copyDir` -> `bin/alluxio init copyDir`

pr-link: Alluxio#18040	change-id: cid-511177a75c48222ff7e0c4122cb2aa17a9f86576
### What changes are proposed in this pull request?

1. Fix bugs in `alluxio.proxy.s3.S3ObjectTask.CompleteMultipartUploadTask#validateParts`.
2. `CompleteMultipartUpload` operates on the specified parts of the specified upload ID rather than on all parts.
3. Add `initiate / complete / abort MPU and upload part` unit tests.
4. Remove `XAttr`.
5. Show all uploaded parts to the user when listing parts.

### Why are the changes needed?

There are problems in `validateParts()` of complete MPU:

1. It can't recognize the S3 error MalformedXML.
2. It throws an unexpected error, InvalidPartOrder.

We need to fix `validateParts()` of complete MPU to keep the S3 proxy's behavior as consistent with AWS as possible.

### Does this PR introduce any user facing changes?

- The bucket name and object name of the part can no longer be obtained while executing ListMultipartUploads.
- The complete MPU response now looks like this:

| part list | response (before) | response (now) |
|-|-|-|
| [] | 500 | MalformedXML |
| [1,3] | InvalidPartOrder | 200 |
| [2,1] | InvalidPartOrder | InvalidPartOrder |
| [3,3] | InvalidPartOrder | InvalidPartOrder |

pr-link: Alluxio#18046	change-id: cid-2d5dbc5491e5d54ca48d20a230a3b0b8dbf4ee97
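The "response (now)" column of the complete MPU table in Alluxio#18046 is consistent with a simple rule: an empty part list is rejected as MalformedXML, and a non-empty list must be strictly ascending. Below is a minimal sketch of that rule only. `PartListValidator` and its `Result` enum are hypothetical names for illustration, not the actual `validateParts` code, and the sketch leaves out any other checks the real method may perform (for example, whether each part was actually uploaded).

```java
import java.util.List;

/** Hypothetical illustration; not the actual validateParts implementation. */
final class PartListValidator {
  /** Outcomes that appear in the response table of the commit message. */
  enum Result { OK, MALFORMED_XML, INVALID_PART_ORDER }

  private PartListValidator() {}

  /**
   * Classifies the part numbers listed in a CompleteMultipartUpload request.
   * Assumption: only emptiness and ordering are checked here.
   */
  static Result validate(List<Integer> partNumbers) {
    // An empty part list is reported as MalformedXML.
    if (partNumbers == null || partNumbers.isEmpty()) {
      return Result.MALFORMED_XML;
    }
    // Part numbers must be strictly ascending; gaps such as [1, 3] are fine,
    // while duplicates ([3, 3]) and descending pairs ([2, 1]) are not.
    for (int i = 1; i < partNumbers.size(); i++) {
      if (partNumbers.get(i) <= partNumbers.get(i - 1)) {
        return Result.INVALID_PART_ORDER;
      }
    }
    return Result.OK;
  }
}
```

Under these assumptions, `PartListValidator.validate(Arrays.asList(1, 3))` yields `OK`, which corresponds to the 200 row of the table.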
### What changes are proposed in this pull request?

Fix multiple mount options specified by `-o` not being recognized by Alluxio Fuse. The `-o` option can be specified multiple times, and each occurrence can take a comma-separated list of `key=value` mount options.

### Why are the changes needed?

Fuse mounting with `bin/alluxio-fuse mount hdfs://10.10.1.2:9000/ /work/alluxio_fuse -o kernel_cache -o attr_timeout=6000 -o entry_timeout=6000` fails with

```
Exception in thread "main" com.beust.jcommander.ParameterException: "-o": couldn't convert "kernel_cache,attr_timeout=6000,entry_timeout=6000" to a `key=value` pair because contains more than 1 `=`
	at alluxio.fuse.options.MountCliOptions$KvPairsConverter.convert(MountCliOptions.java:74)
	at alluxio.fuse.options.MountCliOptions$KvPairsConverter.convert(MountCliOptions.java:50)
	at com.beust.jcommander.JCommander.convertValue(JCommander.java:1333)
	at com.beust.jcommander.ParameterDescription.addValue(ParameterDescription.java:249)
```

The bash script concatenates the multiple occurrences of `-o` into a comma-separated list and passes it to the Java program. This PR does it the other way around: it preserves the separate occurrences and splits the kv pair list in each `-o` option into multiple options.

### Does this PR introduce any user facing changes?

No.

pr-link: Alluxio#18026	change-id: cid-d9b5dcf65ba3de1bd9324ed01f74944171d5ef30
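The splitting described in Alluxio#18026 can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the converter in `MountCliOptions`: the class `MountOptionSplitter` and its `parse` method are hypothetical, JCommander is not involved, and mapping a bare flag such as `kernel_cache` to an empty string is a choice made for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not the actual MountCliOptions converter. */
final class MountOptionSplitter {
  private MountOptionSplitter() {}

  /**
   * Flattens every value given to -o into one ordered map of mount options.
   * Each value may itself be a comma-separated list such as "a=1,b=2".
   */
  static Map<String, String> parse(List<String> dashOValues) {
    Map<String, String> options = new LinkedHashMap<>();
    for (String value : dashOValues) {
      for (String item : value.split(",")) {
        String trimmed = item.trim();
        if (trimmed.isEmpty()) {
          continue; // tolerate empty items such as a trailing comma
        }
        // Split on the first '=' only, so a value may itself contain '='.
        int eq = trimmed.indexOf('=');
        if (eq < 0) {
          options.put(trimmed, ""); // bare flag, e.g. kernel_cache (assumed mapping)
        } else {
          options.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
      }
    }
    return options;
  }
}
```

With this sketch, three separate `-o` values and one pre-joined value `kernel_cache,attr_timeout=6000,entry_timeout=6000` produce the same three entries, which is the behavior the commit message asks for.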
### What changes are proposed in this pull request?

Return relative paths in listStatus results.

### Why are the changes needed?

After Alluxio#17839 the worker only receives and sends absolute UFS paths. This change converts them back to Alluxio relative paths according to the client's view of the UFS.

### Does this PR introduce any user facing changes?

No.

pr-link: Alluxio#18014	change-id: cid-c874c95cb9331c59de7be7e66d3cc9cc0e1a653f
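One way to picture the conversion in Alluxio#18014 is to strip the client's UFS root from an absolute UFS path. The sketch below assumes the client knows a single UFS root as a plain string prefix. `UfsPathConverter`, `toAlluxioPath`, and the example addresses are hypothetical and do not reflect how the PR actually resolves paths.

```java
/** Hypothetical illustration of mapping an absolute UFS path to a relative Alluxio path. */
final class UfsPathConverter {
  private UfsPathConverter() {}

  /**
   * Returns the path of ufsPath relative to ufsRoot, always starting with "/".
   * Assumption: ufsRoot is the single UFS root as seen by the client.
   */
  static String toAlluxioPath(String ufsRoot, String ufsPath) {
    // Normalize the root so that "hdfs://host:9000/data/" and ".../data" behave alike.
    String root = ufsRoot;
    while (root.endsWith("/")) {
      root = root.substring(0, root.length() - 1);
    }
    if (ufsPath.equals(root)) {
      return "/"; // the UFS root itself maps to the Alluxio root
    }
    if (!ufsPath.startsWith(root + "/")) {
      throw new IllegalArgumentException(ufsPath + " is not under UFS root " + ufsRoot);
    }
    return ufsPath.substring(root.length());
  }
}
```

For example, with root `hdfs://10.10.1.2:9000/data/`, the UFS path `hdfs://10.10.1.2:9000/data/a/b` maps to `/a/b`.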
- remove job subcommands that were referring to the old job service
- update `load` subcommand and add necessary flags
- remove checks in the java class that referred to the old load command

pr-link: Alluxio#18045	change-id: cid-31fce5258be608e7d34648e07940857d30340c60
Contributor

@jja725 jja725 left a comment

LGTM, thanks for the clean up
