
Conversation

pankit-eng
Contributor

Summary:
Problem:
In the current implementation, LogTailer buffers each log line in a String that grows elastically with the line's length during the tee operation. Since LogTailer is the underlying implementation for piping user code's stdout/stderr, bad user actor code can drive unbounded memory usage. Worse, once the string buffer has grown to a given size, it never shrinks, so the process keeps hogging that memory.

Bad actor code that exposed the issue:

```python
import asyncio
import logging

from monarch.actor import Actor, endpoint  # assumed import path


class LogBomber(Actor):
    def __init__(self) -> None:
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.INFO)

    @endpoint
    async def spam_logs(self, num_logs: int, delay_ms: int = 0) -> None:
        """Generate a massive number of logs in rapid succession."""
        for i in range(num_logs):
            # Generate both stdout and stderr logs to maximize channel pressure
            print(f"STDOUT_SPAM_{i}: " + "X" * 1000000000, flush=True)  # large stdout lines
            self.logger.error(f"STDERR_SPAM_{i}: " + "Y" * 100000000)  # large error logs

            if delay_ms > 0 and i % 100 == 0:
                await asyncio.sleep(delay_ms / 1000.0)
```
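The failure mode above comes from a tee loop that accumulates a whole line in memory before forwarding it. A minimal Python sketch of that pattern (illustrative only; `naive_tee_line` is a made-up name, not the actual LogTailer code):

```python
import io

def naive_tee_line(stream, sink):
    """Buffer one full line, then forward it to the sink.
    Illustrative sketch of the unbounded-growth pattern, not the
    actual LogTailer implementation."""
    buf = bytearray()
    while (b := stream.read(1)) and b != b"\n":
        buf += b  # grows with the line: a 1 GB log line means a 1 GB buffer
    line = buf.decode("utf-8", errors="replace")
    sink.write(line + "\n")
    return line
```

With a LogBomber-style input, `buf` balloons to the size of the largest line ever seen, and nothing ever reclaims it.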

Solution:

Limit the read to 1024 bytes for a single text line. The rest of the line is skipped and marked with "<TRUNCATED>".
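The bounded read can be sketched as follows (a hypothetical `read_line_bounded` helper, not the actual patch; the real fix applies the same idea inside LogTailer's tee loop):

```python
import io

def read_line_bounded(stream, limit=1024):
    """Read one line, keeping at most `limit` bytes; the remainder of
    the line is drained and marked with "<TRUNCATED>".
    Illustrative sketch only, not the actual LogTailer code."""
    buf = bytearray()
    truncated = False
    while (b := stream.read(1)) and b != b"\n":
        if len(buf) < limit:
            buf += b
        else:
            truncated = True  # drain the rest of the line without storing it
    return buf.decode("utf-8", errors="replace") + ("<TRUNCATED>" if truncated else "")
```

Memory per line is now bounded by `limit`, no matter how large a line the bad actor emits.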

Differential Revision: D82412752

meta-cla bot added the CLA Signed label on Sep 18, 2025
@facebook-github-bot
Contributor

@pankit-eng has exported this pull request. If you are a Meta employee, you can view the originating diff in D82412752.

pankit-eng added a commit to pankit-eng/monarch that referenced this pull request Sep 18, 2025

pankit-eng added a commit to pankit-eng/monarch that referenced this pull request Sep 18, 2025
Summary:

**Solution**:

Limit the read to 256 KB for a single text line. The rest of the line is skipped and marked with "<TRUNCATED>".

Differential Revision: D82412752

