
Conversation

pankit-eng
Contributor

Summary:
Problem:
In the current implementation, LogTailer buffers each log line in a String that grows elastically with the line's length during the tee operation. Since LogTailer is the underlying implementation for piping user code's stdout/stderr, bad user actor code can drive unbounded memory usage. Worse, once the string buffer has grown to a given size, it never shrinks, so the process keeps hogging that memory.

Bad actor code that exposed the issue:

```python
import asyncio
import logging

from monarch.actor import Actor, endpoint  # assumed import path


class LogBomber(Actor):
    def __init__(self) -> None:
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.INFO)

    @endpoint
    async def spam_logs(self, num_logs: int, delay_ms: int = 0) -> None:
        """Generate a massive number of logs in rapid succession."""
        for i in range(num_logs):
            # Generate both stdout and stderr logs to maximize channel pressure
            print(f"STDOUT_SPAM_{i}: " + "X" * 1000000000, flush=True)  # large stdout lines
            self.logger.error(f"STDERR_SPAM_{i}: " + "Y" * 100000000)  # large error logs

            if delay_ms > 0 and i % 100 == 0:
                await asyncio.sleep(delay_ms / 1000.0)
```
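The failure mode above comes from a tee loop that accumulates a whole line in memory before forwarding it. A minimal Python sketch of that pattern (illustrative only; `naive_tee_line` is a made-up name, not the actual LogTailer code):

```python
import io

def naive_tee_line(stream, sink):
    """Buffer one full line, then forward it to the sink.
    Illustrative sketch of the unbounded-growth pattern, not the
    actual LogTailer implementation."""
    buf = bytearray()
    while (b := stream.read(1)) and b != b"\n":
        buf += b  # grows with the line: a 1 GB log line means a 1 GB buffer
    line = buf.decode("utf-8", errors="replace")
    sink.write(line + "\n")
    return line
```

With a LogBomber-style input, `buf` balloons to the size of the largest line ever seen, and nothing ever reclaims it.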

Solution:

Limit the read to 1024 bytes for a single text line. The rest of the line is skipped and marked with "<TRUNCATED>".
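The bounded read can be sketched as follows (a hypothetical `read_line_bounded` helper, not the actual patch; the real fix applies the same idea inside LogTailer's tee loop):

```python
import io

def read_line_bounded(stream, limit=1024):
    """Read one line, keeping at most `limit` bytes; the remainder of
    the line is drained and marked with "<TRUNCATED>".
    Illustrative sketch only, not the actual LogTailer code."""
    buf = bytearray()
    truncated = False
    while (b := stream.read(1)) and b != b"\n":
        if len(buf) < limit:
            buf += b
        else:
            truncated = True  # drain the rest of the line without storing it
    return buf.decode("utf-8", errors="replace") + ("<TRUNCATED>" if truncated else "")
```

Memory per line is now bounded by `limit`, no matter how large a line the bad actor emits.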

Differential Revision: D82412752

meta-cla bot added the CLA Signed label on Sep 18, 2025
@facebook-github-bot
Contributor

@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752.

pankit-eng added a commit to pankit-eng/monarch that referenced this pull request Sep 18, 2025

pankit-eng added a commit to pankit-eng/monarch that referenced this pull request Sep 18, 2025
Summary:

**Solution**:

Limit the read to 256 KB for a single text line. The rest of the line is skipped and marked with "<TRUNCATED>".

Differential Revision: D82412752

