Merge master-2.x commits 2023/06/01~2023/06/30 into main #17717

Merged
alluxio-bot merged 18 commits into Alluxio:main from jiacheliu3:merge-202306
Jul 3, 2023

@jiacheliu3
Contributor

@jiacheliu3 jiacheliu3 commented Jun 30, 2023

What changes are proposed in this pull request?

Merge missing commits from master-2.x into main. This PR includes the commits from 2023/06/01~2023/06/30 in main...master-2.x.

Some commits are skipped because they have already been manually ported to main:
4a57bab
a84b6e6
ec066dc

Commit 11af4ee is not a cherry-pick. It is cleanup work that was missed in #17638, where a test should have been removed together with the Block API.

secfree and others added 16 commits June 30, 2023 13:23
Fix LocalPageStore NPE

I encountered the following exception while using the local cache:

```
2023-05-31T17:25:13.453+0800 ERROR 20230531_092513_00010_uqx2a.1.0.0-7-153 alluxio.client.file.cache.NoExceptionCacheManager Failed to put page PageId{FileId=76f9c79d5d43c725de31295c263291e0, PageIndex=534}, cacheContext CacheContext{cacheIdentifier=null, cacheQuota=alluxio.client.quota.CacheQuota$1@1f, cacheScope=CacheScope{id=.}, hiveCacheContext=null, isTemporary=false}
java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because the return value of "java.lang.Exception.getMessage()" is null
    at alluxio.client.file.cache.store.LocalPageStore.put(LocalPageStore.java:80)
    at alluxio.client.file.cache.LocalCacheManager.putAttempt(LocalCacheManager.java:345)
    at alluxio.client.file.cache.LocalCacheManager.putInternal(LocalCacheManager.java:274)
    at alluxio.client.file.cache.LocalCacheManager.put(LocalCacheManager.java:234)
    at alluxio.client.file.cache.CacheManagerWithShadowCache.put(CacheManagerWithShadowCache.java:52)
    at alluxio.client.file.cache.NoExceptionCacheManager.put(NoExceptionCacheManager.java:55)
    at alluxio.client.file.cache.CacheManager.put(CacheManager.java:196)
    at alluxio.client.file.cache.LocalCacheFileInStream.localCachedRead(LocalCacheFileInStream.java:218)
    at alluxio.client.file.cache.LocalCacheFileInStream.bufferedRead(LocalCacheFileInStream.java:144)
    at alluxio.client.file.cache.LocalCacheFileInStream.readInternal(LocalCacheFileInStream.java:242)
    at alluxio.client.file.cache.LocalCacheFileInStream.positionedRead(LocalCacheFileInStream.java:287)
    at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:153)
    at alluxio.hadoop.HdfsFileInputStream.readFully(HdfsFileInputStream.java:170)
    at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
    at io.trino.filesystem.hdfs.HdfsInput.readFully(HdfsInput.java:42)
    at io.trino.plugin.hive.parquet.TrinoParquetDataSource.readInternal(TrinoParquetDataSource.java:64)
    at io.trino.parquet.AbstractParquetDataSource.readFully(AbstractParquetDataSource.java:120)
    at io.trino.parquet.AbstractParquetDataSource$ReferenceCountedReader.read(AbstractParquetDataSource.java:330)
    at io.trino.parquet.ChunkReader.readUnchecked(ChunkReader.java:31)
    at io.trino.parquet.reader.ChunkedInputStream.readNextChunk(ChunkedInputStream.java:149)
    at io.trino.parquet.reader.ChunkedInputStream.read(ChunkedInputStream.java:93)
```

No user-facing changes.

pr-link: Alluxio#17552
change-id: cid-f35c64b837748c1d46ba7092dae5ad6ef5003bb7

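The trace above shows `String.contains` being invoked on the result of `Exception.getMessage()`, which is allowed to be null. A minimal sketch of a null-safe check follows. The class name, the method name, and the matched text are hypothetical and are not taken from `LocalPageStore`; the actual fix in Alluxio#17552 may look different.

```java
import java.io.IOException;

/** Hypothetical helper (not Alluxio code): inspects an exception message without risking an NPE. */
final class NullSafeMessageCheck {
  private NullSafeMessageCheck() {}

  /** Returns true only if the exception carries a non-null message that contains the given text. */
  static boolean messageContains(Exception e, String text) {
    String message = e.getMessage(); // null for exceptions created without a message
    return message != null && message.contains(text);
  }

  public static void main(String[] args) {
    // An exception without a message no longer triggers a NullPointerException.
    System.out.println(messageContains(new IOException(), "No space left on device"));
    // An exception with a matching message is still detected.
    System.out.println(messageContains(new IOException("No space left on device"), "No space left"));
  }
}
```

The null check comes first, so the `contains` call is only reached when a message exists.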
### What changes are proposed in this pull request?

[DOCFIX] Update cn version of docs/_data/table/cn/master-metrics.yml

### Why are the changes needed?

The Chinese docs/_data/table/cn/master-metrics.yml doc is problematic: the description of Master.CreateDirectoryOps is wrong. This PR synchronizes these updates.

### Does this PR introduce any user facing changes?

Developers can get to know Alluxio in Chinese easily.

pr-link: Alluxio#17011
change-id: cid-15f7792d87a6a9c2d537ce5ff82f697dc9b38117

[DOCFIX] Update cn version of OSS doc

The Chinese ufs/OSS doc is not updated with the latest changes. This PR synchronizes these updates.

Developers can get to know Alluxio in Chinese easily.

pr-link: Alluxio#16932
change-id: cid-fcfee42cb97534141c02c74b8011f31f015dce17

[DOCFIX] Update cn version of Azure-Data-Lake-Gen2 doc

The Chinese ufs/Azure-Data-Lake-Gen2 doc is not updated with the latest changes. This PR synchronizes these updates.

Developers can get to know Alluxio in Chinese easily.

pr-link: Alluxio#16929
change-id: cid-f0c0ac2323fda070c69a1a1b4ea0d7be76c4f0f3

[DOCFIX] Update cn version of Azure-Data-Lake doc

Chinese readers can more easily understand Azure Data Lake.

Developers can get to know Alluxio in Chinese easily.

pr-link: Alluxio#16921
change-id: cid-3b03ba284873b8d99e789be9acde4fc98028993b

[DOCFIX] Update cn version of Deep-Leaning doc

The Chinese solutions/Deep-Leaning doc is not updated with the latest changes. This PR synchronizes these updates.

Developers can get to know Alluxio in Chinese easily.

pr-link: Alluxio#16920
change-id: cid-c75d057724dd6cd9db3da4605acc7bd7c9ece702

### What changes are proposed in this pull request?

This PR fixes some incorrect code in WorkerWebUILogs.

### Why are the changes needed?

WorkerWebUILogs contains some incorrect code that should be fixed.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#17052
change-id: cid-47d099e6705ded625466492faa846a3ae6232cf1

### What changes are proposed in this pull request?

When a standby master sends a heartbeat request to the leader master, there is often no trace of it. Where necessary, we should log this exchange.

### Why are the changes needed?

Adding some logs is necessary for the following reasons:
1. When the master node fails, logs can help troubleshoot the cause.
2. They make the communication between the standby master and the leader master clearer.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#17204
change-id: cid-ec012f3a6736d030f9f15a141254df8cf77d861b

### What changes are proposed in this pull request?

This PR improves the comments related to DefaultMetaMaster and adds some @links.

### Why are the changes needed?

Some of the documentation comments in DefaultMetaMaster are not clear enough and should be fixed where possible.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#17202
change-id: cid-3ca5c75c9374dfd5e92be806d115811a737c3aff

### What changes are proposed in this pull request?

Remove table operations from user operation docs.

### Why are the changes needed?

The SDS (table) service is deprecated. The link to `#table-operations` is dead.

### Does this PR introduce any user facing changes?

Yes.

pr-link: Alluxio#17581
change-id: cid-ea608c71ef24c22da3399ce744c6937ccbc7eb45

…reads to fail

Fix read failure when a mismatch occurs between the block size recorded in memory by `BlockMeta` and the length of the actual physical block file. When such a mismatch is detected, the block is removed from worker storage, and the worker falls back to reading from UFS.

Such a mismatch otherwise causes reading the block to fail.

No user-facing changes.

pr-link: Alluxio#17564
change-id: cid-2758ff97e5016c1aae7ededb244e919233ae6d3b

In Section 6.4 Write to UFS Only (THROUGH), write completion and persistence are described in the wrong order. When writing to UFS only, the write should complete only after the data is persisted, so that written data is not lost.

Suggest updating the sentence "This write type ensures that data will be persisted after the write completes" to "This write type ensures that data will be persisted before the write completes".

pr-link: Alluxio#17536
change-id: cid-76189e192afafd96afe410aeedec73eb65e3e161

…[DOCFIX] Fix some descriptions related to the FileSystemMaster module

This PR fixes some descriptions related to the FileSystemMaster module.

The FileSystemMaster module contains some inappropriate descriptions and comments that should be fixed where possible.

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#17215
change-id: cid-361c9b28832a3797941987663080f7e2a57d4965

…adSleeper

Implements a light thread sleeper that can be woken while it is sleeping so that it can determine whether it needs to continue sleeping.

Without this feature, a refreshed interval for a heartbeat thread cannot take effect while the thread is sleeping.

No user-facing changes.

pr-link: Alluxio#17298
change-id: cid-a4c471eb06e6389868278fab088556c8d7a23986

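The idea described in this commit can be sketched with a lock and a condition: the sleeper waits on the condition, and a wake-up makes it re-read the interval and decide whether to keep sleeping. This is a minimal sketch under assumptions, not the implementation from Alluxio#17298. `WakeableSleeper`, `sleep`, `wakeUp`, and `demoWakesEarly` are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sleeper (not Alluxio code) that re-reads its interval whenever it is woken. */
final class WakeableSleeper {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition wakeup = lock.newCondition();

  /** Blocks until the currently supplied interval has elapsed since this call started. */
  void sleep(Supplier<Duration> interval) throws InterruptedException {
    long startNanos = System.nanoTime();
    lock.lock();
    try {
      while (true) {
        // Re-read the interval on every pass so a refreshed value takes effect.
        long remaining = interval.get().toNanos() - (System.nanoTime() - startNanos);
        if (remaining <= 0) {
          return;
        }
        wakeup.awaitNanos(remaining); // returns early when wakeUp() signals
      }
    } finally {
      lock.unlock();
    }
  }

  /** Wakes a sleeping thread so that it re-evaluates whether to continue sleeping. */
  void wakeUp() {
    lock.lock();
    try {
      wakeup.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /** Starts a 60 s sleep, shortens the interval to 10 ms, and reports whether the sleeper returned. */
  static boolean demoWakesEarly() {
    AtomicReference<Duration> interval = new AtomicReference<>(Duration.ofSeconds(60));
    WakeableSleeper sleeper = new WakeableSleeper();
    Thread t = new Thread(() -> {
      try {
        sleeper.sleep(interval::get);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    t.setDaemon(true);
    t.start();
    try {
      Thread.sleep(100);
      interval.set(Duration.ofMillis(10)); // refresh the interval first
      sleeper.wakeUp();                    // then wake the sleeper so it notices
      t.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !t.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("sleeper returned early: " + demoWakesEarly());
  }
}
```

Because the interval is read while holding the lock and `awaitNanos` releases the lock atomically, a wake-up issued after the interval is updated cannot be lost.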
### What changes are proposed in this pull request?

Calls shutdown on the executor service in the S3 UFS class.

### Why are the changes needed?

If shutdown is not called, the threads will not be garbage collected.

### Does this PR introduce any user facing changes?

No

pr-link: Alluxio#15748
change-id: cid-86d16e80118882044bf529a619b7915c8451eb03

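A minimal sketch of the pattern this commit describes: a class that owns an `ExecutorService` shuts it down when it is closed. `ExecutorOwner` is a hypothetical stand-in, not the S3 UFS class, and the bounded `awaitTermination` step is an assumption added here for illustration. The commit itself only states that shutdown is called.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical resource owner (not the S3 UFS class) that releases its worker threads on close. */
final class ExecutorOwner implements AutoCloseable {
  private final ExecutorService executor = Executors.newFixedThreadPool(2);

  void submit(Runnable task) {
    executor.submit(task);
  }

  boolean isTerminated() {
    return executor.isTerminated();
  }

  @Override
  public void close() {
    // Without shutdown(), the pool's non-daemon worker threads stay alive.
    executor.shutdown();
    try {
      // Assumption for this sketch: wait briefly, then cancel whatever is still running.
      if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
        executor.shutdownNow();
      }
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    try (ExecutorOwner owner = new ExecutorOwner()) {
      owner.submit(() -> System.out.println("task ran"));
    }
  }
}
```

Implementing `AutoCloseable` lets callers use try-with-resources so the shutdown cannot be forgotten.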
### What changes are proposed in this pull request?

Fix a deadlock when the master process exits.

### Why are the changes needed?

```
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
    at java.lang.Shutdown.exit(java.base@11.0.12-ga/Shutdown.java:173)
    - **waiting to lock <0x00007f55e98d1920> (a java.lang.Class for java.lang.Shutdown)**
    at java.lang.Runtime.exit(java.base@11.0.12-ga/Runtime.java:116)
    at java.lang.System.exit(java.base@11.0.12-ga/System.java:1752)
    at alluxio.ProcessUtils.fatalError(ProcessUtils.java:83)
    at alluxio.ProcessUtils.fatalError(ProcessUtils.java:63)
    at alluxio.master.journal.MasterJournalContext.waitForJournalFlush(MasterJournalContext.java:99)
    at alluxio.master.journal.MasterJournalContext.close(MasterJournalContext.java:109)
    - locked <0x00007f5602000010> (a alluxio.master.journal.MasterJournalContext)
    at alluxio.master.journal.StateChangeJournalContext.close(StateChangeJournalContext.java:55)
    at alluxio.master.block.DefaultBlockMaster.getNewContainerId(DefaultBlockMaster.java:906)
    - locked <0x00007f594a000298> (a alluxio.master.block.BlockContainerIdGenerator)
    at alluxio.master.file.meta.InodeDirectoryIdGenerator.initialize(InodeDirectoryIdGenerator.java:83)
    at alluxio.master.file.meta.InodeDirectoryIdGenerator.getNewDirectoryId(InodeDirectoryIdGenerator.java:57)
    - **locked <0x00007f55e197de68> (a alluxio.master.file.meta.InodeDirectoryIdGenerator)**
    at alluxio.master.file.meta.InodeTree.createPath(InodeTree.java:979)
    at alluxio.master.file.DefaultFileSystemMaster.createDirectoryInternal(DefaultFileSystemMaster.java:2746)
    at alluxio.master.file.InodeSyncStream.loadDirectoryMetadataInternal(InodeSyncStream.java:1374)
    at alluxio.master.file.InodeSyncStream.loadDirectoryMetadata(InodeSyncStream.java:1294)
    at alluxio.master.file.TxInodeSyncStream.concurrentLoadMetadata(TxInodeSyncStream.java:123)
    at alluxio.master.file.TxInodeSyncStream.loadMetadataForPath(TxInodeSyncStream.java:98)
    at alluxio.master.file.InodeSyncStream.syncInodeMetadata(InodeSyncStream.java:743)
    at alluxio.master.file.InodeSyncStream.syncInternal(InodeSyncStream.java:491)
    at alluxio.master.file.InodeSyncStream.sync(InodeSyncStream.java:409)
    at alluxio.master.file.DefaultFileSystemMaster.syncMetadata(DefaultFileSystemMaster.java:4075)
    at alluxio.master.file.TxFileSystemMaster.syncMetadata(TxFileSystemMaster.java:298)
    at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1111)
    at alluxio.master.file.TxFileSystemMaster.listStatus(TxFileSystemMaster.java:827)
    ...
```

```
stackTrace:
java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(java.base@11.0.12-ga/Native Method)
    - waiting on <no object reference available>
    at java.lang.Thread.join(java.base@11.0.12-ga/Thread.java:1300)
    - waiting to re-lock in wait() <0x00007f55e98d0918> (a java.lang.Thread)
    at java.lang.Thread.join(java.base@11.0.12-ga/Thread.java:1375)
    at java.lang.ApplicationShutdownHooks.runHooks(java.base@11.0.12-ga/ApplicationShutdownHooks.java:107)
    at java.lang.ApplicationShutdownHooks$1.run(java.base@11.0.12-ga/ApplicationShutdownHooks.java:46)
    at java.lang.Shutdown.runHooks(java.base@11.0.12-ga/Shutdown.java:130)
    at java.lang.Shutdown.exit(java.base@11.0.12-ga/Shutdown.java:174)
    - locked <0x00007f55e98d1920> (a java.lang.Class for java.lang.Shutdown)
    at java.lang.Runtime.exit(java.base@11.0.12-ga/Runtime.java:116)
    at java.lang.System.exit(java.base@11.0.12-ga/System.java:1752)
    at alluxio.ProcessUtils.fatalError(ProcessUtils.java:83)
    at alluxio.ProcessUtils.fatalError(ProcessUtils.java:63)
    at alluxio.master.journal.MasterJournalContext.waitForJournalFlush(MasterJournalContext.java:99)
    at alluxio.master.journal.MasterJournalContext.close(MasterJournalContext.java:109)
    - locked <0x00007f5c0a60bda0> (a alluxio.master.journal.MasterJournalContext)
    at alluxio.master.journal.StateChangeJournalContext.close(StateChangeJournalContext.java:55)
    at alluxio.master.journal.FileSystemMergeJournalContext.close(FileSystemMergeJournalContext.java:90)
    - locked <0x00007f5c0a60bde0> (a alluxio.master.journal.FileSystemMergeJournalContext)
    at alluxio.master.file.RpcContext.closeQuietly(RpcContext.java:141)
    at alluxio.master.file.RpcContext.close(RpcContext.java:129)
    at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1227)
    at alluxio.master.file.TxFileSystemMaster.listStatus(TxFileSystemMaster.java:827)
```

```
java.lang.Thread.State: BLOCKED (on object monitor)
    at alluxio.master.file.meta.InodeDirectoryIdGenerator.peekDirectoryId(InodeDirectoryIdGenerator.java:76)
    - **waiting to lock <0x00007f55e197de68> (a alluxio.master.file.meta.InodeDirectoryIdGenerator)**
    at alluxio.master.file.DefaultFileSystemMaster.stop(DefaultFileSystemMaster.java:787)
    at alluxio.master.file.TxFileSystemMaster.stop(TxFileSystemMaster.java:245)
    at alluxio.master.AbstractMaster.close(AbstractMaster.java:156)
    at alluxio.master.file.DefaultFileSystemMaster.close(DefaultFileSystemMaster.java:800)
    at alluxio.master.file.TxFileSystemMaster.close(TxFileSystemMaster.java:581)
    at alluxio.Registry.close(Registry.java:156)
    at alluxio.master.AlluxioMasterProcess.stop(AlluxioMasterProcess.java:412)
    - locked <0x00007f55e51e2ba8> (a java.util.concurrent.atomic.AtomicBoolean)
    at alluxio.ProcessUtils.lambda$stopProcessOnShutdown$0(ProcessUtils.java:98)
    at alluxio.ProcessUtils$$Lambda$363/0x00007f54f02dc840.run(Unknown Source)
    at java.lang.Thread.run(java.base@11.0.12-ga/Thread.java:829)
```

The blocked listStatus thread (thread1) is waiting for lock <0x00007f55e98d1920>, which is held by another waiting listStatus thread (thread2) that is also trying to exit. thread2 is waiting for the shutdown hook (alluxio-process-shutdown-hook) to finish before it continues exiting. The alluxio-process-shutdown-hook is in turn waiting for lock <0x00007f55e197de68>, which is held by thread1.

### Does this PR introduce any user facing changes?

No

pr-link: Alluxio#17628
change-id: cid-d077517ba20445ebe95520a1710c173b41810f6d

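The cycle above closes because the shutdown hook blocks indefinitely on a lock whose owner is itself stuck in `System.exit`. One general way to break such a cycle is for the shutdown path to acquire the lock with a timeout. The sketch below illustrates that idea only; it is not necessarily the approach taken in Alluxio#17628. `BoundedLockInHook`, `stop`, and `demo` are hypothetical names, and a `ReentrantLock` stands in for the `synchronized` monitor seen in the traces.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch (not Alluxio code): a shutdown path that never blocks forever on a lock. */
final class BoundedLockInHook {
  private final ReentrantLock idGeneratorLock = new ReentrantLock();

  /** Called on the shutdown path. Returns true if the lock was obtained within the timeout. */
  boolean stop(long timeoutMillis) {
    boolean locked = false;
    try {
      locked = idGeneratorLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
      // With the lock held, state could be read safely here; otherwise skip that step.
      return locked;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      if (locked) {
        idGeneratorLock.unlock();
      }
    }
  }

  /** Simulates thread1 holding the lock while the hook runs, then releasing it. */
  static boolean demo() {
    BoundedLockInHook master = new BoundedLockInHook();
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread holder = new Thread(() -> {
      master.idGeneratorLock.lock();
      try {
        held.countDown();
        release.await(); // stands in for a thread stuck while holding the lock
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        master.idGeneratorLock.unlock();
      }
    });
    holder.setDaemon(true);
    holder.start();
    try {
      held.await();
      boolean gotLockWhileHeld = master.stop(100); // times out instead of deadlocking
      release.countDown();
      holder.join();
      boolean gotLockAfterRelease = master.stop(100);
      return !gotLockWhileHeld && gotLockAfterRelease;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("hook stayed responsive: " + demo());
  }
}
```

The trade-off is that the shutdown path must tolerate skipping the guarded step when the lock cannot be obtained in time.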
@jiacheliu3 jiacheliu3 changed the title Merge 202306 Merge master-2.x commits 2023/06/01~2023/06/30 into main Jun 30, 2023
@jiacheliu3 jiacheliu3 requested review from dbw9580 and elega July 3, 2023 06:39
@jiacheliu3 jiacheliu3 added the type-feature This issue is a feature request label Jul 3, 2023
@jiacheliu3
Contributor Author

alluxio-bot, sync-merge this please

@alluxio-bot alluxio-bot merged commit 93a008b into Alluxio:main Jul 3, 2023
Labels

type-feature This issue is a feature request

10 participants