ModuleCache issues on Linux (Ubuntu)

Ilya Kuteev via cfe-dev
Hello everyone!
I've run into a strange problem using Clang modules on Linux. My machine runs Ubuntu 18.04.4 LTS.
When I put my ModuleCache in RAM, everything is OK.
When I put the ModuleCache on an SSD/HDD, I very frequently get various errors with .pcm files, such as the ones below:
```
fatal error: module file '/home/ilyakuteev/module_cache/2/1721172304169885905/3OA4HVRBFA8AJ/Base-5W2XL4QH4QAH.pcm' is out of date and needs to be rebuilt: signature mismatch
```

```
fatal error: malformed or corrupted AST file: 'Unexpected end of file reading 4154469704 of 4738416 bytes'
note: after modifying system headers, please delete the module cache at '/home/ilyakuteev/module_cache/1721172304169885905/37WO4K6LHZNB5'
fatal error: error in backend: Invalid abbrev number
```

```
fatal error: malformed or corrupted AST file: 'declaration ID out-of-range for AST file'
note: after modifying system headers, please delete the module cache at '/home/ilyakuteev/module_cache/3/1721172304169885905/1907ERGA34IRD'
```

I've done some research, and the failure appears to happen inside a single clang process: clang writes the .pcm file to disk, then reads it back and gets a malformed file, or there is some other problem with reading the on-disk .pcm.

I've found a patch, https://reviews.llvm.org/D22636, that appears to fix my problems, but I don't want to use it because it just rebuilds an "owned" module if its AST is corrupted, which looks like retrying the operation rather than a real fix. The patch also introduces some unstable behavior.

My questions:
1) Is this a known problem?
2) How can I safely hot-fix it until it is fixed in clang?
3) Do I need to provide more logs for this issue to be fixed?

Thanks


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: ModuleCache issues on Linux (Ubuntu)

David Blaikie via cfe-dev
Can't say any of this sounds especially familiar to me - cc'd a few folks who might have some context.

On Fri, Feb 19, 2021 at 6:28 AM Ilya Kuteev via cfe-dev <[hidden email]> wrote:
[...]

Re: ModuleCache issues on Linux (Ubuntu)

Ilya Kuteev via cfe-dev
I figured out that the problem lies in reading PCM files through FileManager:
clang/lib/Basic/FileManager.cpp:271
```
FileEntry &UFE = UniqueRealFiles[Status.getUniqueID()];
```
If I remove this cache, everything is OK.
It seems that an inode is shared between a deleted/edited PCM file and a newly created one.
When that happens, FileManager returns a FileEntryRef to an invalid file.

How can I fix this issue? Maybe we need to remove the UniqueRealFiles cache
for PCM files, or make it handle this case?


On March 1, 2021, at 20:13, David Blaikie <[hidden email]> wrote:

[...]