-
Notifications
You must be signed in to change notification settings - Fork 73
Fix LogTailer's unbounded mem usage #1261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752. |
Summary: **Problem**: In the current implementation, LogTailer uses string that elastically grows with log line size. This is because it uses buffer String object in the tee operation. LogTailer being the underlying implementation for piping use code's stduot/stderr is currently **prone to bad user actor code leading to unbounded mem usage**. And once the string buffer has grown to a given size, it remains at the same size leading to inefficient usage or hogging of memory. Bad Actor code that exposed the issue: ``` class LogBomber(Actor): def __init__(self) -> None: self.logger = logging.getLogger() self.logger.setLevel(logging.INFO) endpoint async def spam_logs(self, num_logs: int, delay_ms: int = 0) -> None: """Generate a massive number of logs in rapid succession""" for i in range(num_logs): # Generate both stdout and stderr logs to maximize channel pressure print(f"STDOUT_SPAM_{i}: " + "X" * 1000000000, flush=True) # Large log lines self.logger.error(f"STDERR_SPAM_{i}: " + "Y" * 100000000) # Large error logs if delay_ms > 0 and i % 100 == 0: await asyncio.sleep(delay_ms / 1000.0) ``` **Solution**: Limit the read to 1024 Bytes for a single text line. The rest of the text is skipped and marked with "<TRUNCATED>". Differential Revision: D82412752
fa81eb0
to
b8329c0
Compare
@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752. |
b8329c0
to
6f69c5a
Compare
@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752. |
Summary: **Problem**: In the current implementation, LogTailer uses string that elastically grows with log line size. This is because it uses buffer String object in the tee operation. LogTailer being the underlying implementation for piping use code's stduot/stderr is currently **prone to bad user actor code leading to unbounded mem usage**. And once the string buffer has grown to a given size, it remains at the same size leading to inefficient usage or hogging of memory. Bad Actor code that exposed the issue: ``` class LogBomber(Actor): def __init__(self) -> None: self.logger = logging.getLogger() self.logger.setLevel(logging.INFO) endpoint async def spam_logs(self, num_logs: int, delay_ms: int = 0) -> None: """Generate a massive number of logs in rapid succession""" for i in range(num_logs): # Generate both stdout and stderr logs to maximize channel pressure print(f"STDOUT_SPAM_{i}: " + "X" * 1000000000, flush=True) # Large log lines self.logger.error(f"STDERR_SPAM_{i}: " + "Y" * 100000000) # Large error logs if delay_ms > 0 and i % 100 == 0: await asyncio.sleep(delay_ms / 1000.0) ``` **Solution**: Limit the read to 256 KB for a single text line. The rest of the text is skipped and marked with "<TRUNCATED>". Differential Revision: D82412752
6f69c5a
to
d2c7812
Compare
@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752. |
Summary: **Problem**: In the current implementation, LogTailer uses string that elastically grows with log line size. This is because it uses buffer String object in the tee operation. LogTailer being the underlying implementation for piping use code's stduot/stderr is currently **prone to bad user actor code leading to unbounded mem usage**. And once the string buffer has grown to a given size, it remains at the same size leading to inefficient usage or hogging of memory. Bad Actor code that exposed the issue: ``` class LogBomber(Actor): def __init__(self) -> None: self.logger = logging.getLogger() self.logger.setLevel(logging.INFO) endpoint async def spam_logs(self, num_logs: int, delay_ms: int = 0) -> None: """Generate a massive number of logs in rapid succession""" for i in range(num_logs): # Generate both stdout and stderr logs to maximize channel pressure print(f"STDOUT_SPAM_{i}: " + "X" * 1000000000, flush=True) # Large log lines self.logger.error(f"STDERR_SPAM_{i}: " + "Y" * 100000000) # Large error logs if delay_ms > 0 and i % 100 == 0: await asyncio.sleep(delay_ms / 1000.0) ``` **Solution**: Limit the read to 256 KB for a single text line. The rest of the text is skipped and marked with "<TRUNCATED>". Differential Revision: D82412752
d2c7812
to
2dcd1da
Compare
@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752. |
Summary:
Problem:
In the current implementation, LogTailer uses string that elastically grows with log line size. This is because it uses buffer String object in the tee operation. LogTailer being the underlying implementation for piping use code's stduot/stderr is currently prone to bad user actor code leading to unbounded mem usage. And once the string buffer has grown to a given size, it remains at the same size leading to inefficient usage or hogging of memory.
Bad Actor code that exposed the issue:
Solution:
Limit the read to 1024 Bytes for a single text line. The rest of the text is skipped and marked with "".
Differential Revision: D82412752