Add Runner for Integration Tests for the SOM language #128

rhys-h-walker · 2025-07-09T09:36:11Z

This pull request adds the implementation of the test runner for the integration tests previously added in #129. The runner uses PyTest.

To run the tests, one needs to set a few environment variables.
In the current state, this should be something like:

export VM=./som.sh
export CLASSPATH=./core-lib/Smalltalk
export AWFY=./core-lib/Examples/AreWeFastYet/Core
TEST_EXPECTATIONS=integration-tests.yml python3 -m pytest

The file specified with TEST_EXPECTATIONS lists the tests that are expected to fail for a specific SOM implementation. This is needed because various language features are currently not specified, and yksom made its own choices.
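
As a rough illustration, the sketch below shows how a runner might turn these variables into the command line for a single test. The name `build_command` and the error handling are assumptions made for this example and are not taken from test_runner.py; the `-cp` form mirrors the "Command used" line printed in the runner's failure output.

```python
import os


def build_command(test_file, env=None):
    """Build the VM invocation for one test file (hypothetical helper)."""
    env = os.environ if env is None else env

    # Fail early with a readable message when a required variable is unset.
    missing = [name for name in ("VM", "CLASSPATH") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Assumed shape: <VM> -cp <CLASSPATH> <test file>
    return [env["VM"], "-cp", env["CLASSPATH"], test_file]


example_env = {"VM": "./som.sh", "CLASSPATH": "./core-lib/Smalltalk"}
print(build_command("Tests/integer_asdouble.som", example_env))
```

Passing the environment as a parameter keeps the helper testable without touching the real process environment.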

Core-Lib Updates in the SOM Implementations

Status	SOM	PR
✅	SOM (java)	SOM-st/som-java#40
✅	TruffleSOM	SOM-st/TruffleSOM#235
✅	PySOM	SOM-st/PySOM#67
✅	JsSOM	SOM-st/JsSOM#21
✅	SOM++	SOM-st/SOMpp#76
	SOM-RS	https://github.com/Hirevo/som-rs/pull/

.github/workflows/ci.yml

IntegrationTests/conftest.py

IntegrationTests/test_runner.py

smarr · 2025-07-11T19:28:40Z

Another small issue:

I have a test that passes, but was expected to fail. Reporting this should be minimal, and does not need to show the whole stuff.

Currently, it shows this:

E           AssertionError: Test core-lib/IntegrationTests/Tests/integer_asdouble.som is in failing_as_unspecified but passed
E
E             Expected stdout:
E             1|    0.0
E             2|    1.0
E             3|    -1.0
E             4|    1.8446744073709552e19
E             5|    -1.8446744073709552e19
E             Given stdout   :
E             1|    0.0
E             2|    1.0
E             3|    -1.0
E             4|    1.8446744073709552e19
E             5|    -1.8446744073709552e19
E             6|
E             Expected stderr:
E
E             Given stderr   :
E             1|
E             Command used   : ./som.sh -cp Smalltalk core-lib/IntegrationTests/Tests/integer_asdouble.som
E             Case sensitive : False
E             Stdout diff    :
E               0.0
E               1.0
E               -1.0
E               1.8446744073709552e19
E             - -1.8446744073709552e19+ -1.8446744073709552e19
E             ?                       +
E
E             Stderr diff    :
E
E
E           assert False

But all that's needed is:
E AssertionError: Test core-lib/IntegrationTests/Tests/integer_asdouble.som is in failing_as_unspecified but passed

Also, it looks like there's something odd with the diff:

E - -1.8446744073709552e19+ -1.8446744073709552e19
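
One possible explanation, offered as a guess rather than a diagnosis of test_runner.py: Python's `difflib.ndiff` emits each line exactly as it was given. If the expected text has no trailing newline while the actual output has one, joining the result with an empty string fuses the `-` and `+` lines in the way seen above. The helper names below are hypothetical.

```python
import difflib


def fused_diff(expected, actual):
    # keepends=True keeps "\n" on every line that has one; a last line
    # without "\n" is glued to whatever ndiff yields next.
    return "".join(
        difflib.ndiff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
        )
    )


def clean_diff(expected, actual):
    # Drop line endings before diffing and add them back when joining.
    return "\n".join(difflib.ndiff(expected.splitlines(), actual.splitlines()))


expected = "1.0\n-1.8446744073709552e19"  # no trailing newline
actual = "1.0\n-1.8446744073709552e19\n"  # trailing newline

print(fused_diff(expected, actual))
print(clean_diff(expected, actual))
```

With line endings stripped before diffing, the two texts compare as identical and no `-`/`+` lines appear.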

rhys-h-walker · 2025-07-14T11:17:18Z

Tests that fail only because they passed while listed under unspecified/known/unsupported now just report that they passed and that they are listed in the tags file.
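
A minimal sketch of that reporting, assuming the expectations are available as a mapping from category name to a list of test paths. The function name and the data layout are illustrative assumptions; only the message wording and the category names `failing_as_unspecified` and `known_failures` come from the output shown in this thread.

```python
def unexpected_pass_message(test_path, expectations):
    """Return a one-line message if a passing test is listed as failing."""
    for category, paths in expectations.items():
        if test_path in paths:
            return f"Test {test_path} is in {category} but passed"
    return None  # the test was not expected to fail, nothing to report


expectations = {
    "known_failures": [],
    "failing_as_unspecified": ["Tests/integer_asdouble.som"],
}
print(unexpected_pass_message("Tests/integer_asdouble.som", expectations))
```

Because the test did pass, there is no diff worth showing, so the single line carries all the information.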

smarr · 2025-07-15T22:18:22Z

Hm, there's some problem with the current output.

Looks like everything is turned lower case:

E     stdout diff with stdout expected
E
E       0
E     - 1
E     - true
E     - false
E     - false
E     - false
E     - true
E     - false
E     - false
E     - false
E     stderr diff with stderr expected
E
E     + warning: a terminally deprecated method in sun.misc.unsafe has been called
E     + warning: sun.misc.unsafe::objectfieldoffset has been called by com.oracle.truffle.api.nodes.nodeclassimpl$nodefielddata (file:/users/smarr/projects/som/trufflesom/../graal/truffle/mxbuild/dists/truffle-api.jar)
E     + warning: please consider reporting this to the maintainers of class com.oracle.truffle.api.nodes.nodeclassimpl$nodefielddata
E     + warning: sun.misc.unsafe::objectfieldoffset will be removed in a future release
E     + exception in thread "main" org.graalvm.polyglot.polyglotexception: com.oracle.truffle.api.dsl.unsupportedspecializationexception: unexpected values provided for booleaninlinedliteralnode.andinlinedliteralnode@51549490: [1], [long]
E     + 	at trufflesom/trufflesom.interpreter.nodes.specialized.booleaninlinedliteralnode.evaluateargument(booleaninlinedliteralnode.java:48)
E     + 	at trufflesom/trufflesom.interpreter.nodes.specialized.booleaninlinedliteralnode$andinlinedliteralnode.executeboolean(booleaninlinedliteralnode.java:69)
E     + 	at trufflesom/trufflesom.interpreter.nodes.specialized.booleaninlinedliteralnode$andinlinedliteralnode.executegeneric(booleaninlinedliteralnode.java:63)
E     + 	at trufflesom/trufflesom.interpreter.nodes.sequencenode.executegeneric(sequencenode.java:48)
E     + 	at trufflesom/trufflesom.interpreter.method.execute(method.java:62)
E     + 	at <som> bool4>>#run(unknown)
E     + 	at <som> system>>#initialize:(smalltalk/system.som:48:1961-1963)
E     + 	at <som> (unknown)
E     + 	at org.graalvm.polyglot/org.graalvm.polyglot.context.eval(context.java:418)
E     + 	at trufflesom/trufflesom.launcher.eval(launcher.java:32)
E     + 	at trufflesom/trufflesom.launcher.main(launcher.java:13)
E     + original internal error:
E     + com.oracle.truffle.api.dsl.unsupportedspecializationexception: unexpected values provided for booleaninlinedliteralnode.andinlinedliteralnode@51549490: [1], [long]

rhys-h-walker · 2025-07-15T22:57:05Z

Output will be converted to lower case if case_sensitive is not set to be true in the test file. If case sensitivity is imperative for a test then it needs to be specified in the test file. The diff just reflects that, unless I am missing something in the output.
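
The behaviour described here can be sketched roughly as follows. This is an illustration under assumptions, not the runner's code: the function name, the line-by-line equality check, and the whitespace stripping are all choices made for the example.

```python
import difflib


def compare_output(expected, actual, case_sensitive=False):
    """Compare two outputs and return (matches, diff of what was compared)."""
    if not case_sensitive:
        # Both sides are lowered, so the diff below is lower case as well.
        expected, actual = expected.lower(), actual.lower()

    expected_lines = [line.strip() for line in expected.strip().splitlines()]
    actual_lines = [line.strip() for line in actual.strip().splitlines()]

    diff = "\n".join(difflib.ndiff(expected_lines, actual_lines))
    return expected_lines == actual_lines, diff


matches, diff = compare_output("True\nFalse", "true\nFALSE\n")
print(matches)
print(diff)
```

Diffing the normalized text keeps the report consistent with the comparison, at the cost of not showing the original casing.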

smarr · 2025-07-16T09:49:38Z

Hm, right. True, it's the diff.
I found it confusing seeing it like this. Hmmm...

rhys-h-walker · 2025-07-16T11:44:51Z

I could show the output in its original case and then show the diff, but I think a diff of what is actually compared makes more sense; otherwise it would point out case differences.

smarr · 2025-07-16T23:21:02Z

@rhys-h-walker I squashed lots of commits to have a better overview. I also rebased on the new master, which contains the merge with the integration tests.

Please have a look at the commits I added. There were various bits that I pointed out earlier, and they simplify the code a bit.

Please also have a look at the test_tester stuff.
At the moment, whether the tests pass depends on the directory pytest is executed in.

This passes: core-lib/IntegrationTests$ pytest -s -v test_tester.py -m tester

But from one level up, it doesn't work: core-lib$ pytest -s -v IntegrationTests/test_tester.py -m tester

>       assert external_vars.known_failures == test_list
E       AssertionError: assert ['Integration...hod/test.som'] == ['./Tests/mut...hod/test.som']
E         
E         At index 0 diff: 'IntegrationTests/Tests/mutate_superclass_method/test.som' != './Tests/mutate_superclass_method/test.som'
E         
E         Full diff:
E           [
E         -     './Tests/mutate_superclass_method/test.som',
E         ?      ^...

rhys-h-walker · 2025-07-17T08:48:09Z

Yep, that's an easy enough fix. I think one of the testers is missing a relative path.

I have also noticed that the GENERATE_REPORT function no longer works correctly. The names it outputs for tests include the whole core-lib/IntegrationTests prefix, so the output no longer works directly as a TEST_EXCEPTIONS file. I'll fix that too.
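
One way to make the reported names independent of the working directory is to cut every path down to the part starting at `Tests/`. The sketch below is only an illustration; `name_from_tests_dir` is a hypothetical helper, not a function from the runner.

```python
from pathlib import PurePosixPath


def name_from_tests_dir(path):
    """Drop everything before the first 'Tests' path component."""
    parts = PurePosixPath(path).parts
    if "Tests" not in parts:
        return path  # leave unrelated paths untouched
    return "/".join(parts[parts.index("Tests"):])


for candidate in (
    "./Tests/mutate_superclass_method/test.som",
    "IntegrationTests/Tests/mutate_superclass_method/test.som",
    "core-lib/IntegrationTests/Tests/integer_asdouble.som",
):
    print(name_from_tests_dir(candidate))
```

Matching on whole path components means `IntegrationTests` is not mistaken for `Tests`.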

rhys-h-walker · 2025-07-17T09:24:06Z

Also, Tests/vector_awfy_capacity.som may need updating; it features a fixed classpath that is passed to the SOM++ interpreter.

VM:
    status: success
    custom_classpath: ./core-lib/Examples/AreWeFastYet/Core:./core-lib/Smalltalk
    stdout:
        50
        100
        10

It'll need some kind of relative path location added to it.
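
For readers unfamiliar with the test definition format, here is a rough sketch of how a block like the one above could be read in a single pass. Everything in it is an assumption made for illustration: the `Definition` class, the default values, and the rule that `status`, `custom_classpath`, and `case_sensitive` are the only settings. It is not the parser used by the runner.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SETTINGS = ("status", "custom_classpath", "case_sensitive")


@dataclass
class Definition:
    status: str = "success"  # assumed default
    custom_classpath: Optional[str] = None
    case_sensitive: bool = False
    stdout: List[str] = field(default_factory=list)
    stderr: List[str] = field(default_factory=list)


def parse_definition(text):
    result = Definition()
    section = None  # "stdout" or "stderr" while collecting expected lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "VM:":
            continue
        if line in ("stdout:", "stderr:"):
            section = line[:-1]
            continue
        # Split on the first colon only, since classpaths contain colons.
        key, sep, value = line.partition(":")
        if sep and key in SETTINGS:
            value = value.strip()
            if key == "case_sensitive":
                # Compare the text; bool("False") would be True.
                result.case_sensitive = value.lower() == "true"
            else:
                setattr(result, key, value)
            section = None
        elif section is not None:
            getattr(result, section).append(line)
    return result
```

A limitation of this sketch is that an expected output line that itself starts with one of the setting names followed by a colon would be misread as a setting.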

smarr · 2025-07-17T09:30:46Z

Hm, yeah, I suppose that's a little different to the load_file case where it's inside the SOM program. Not sure whether we can find a consistent way of dealing with both.

rhys-h-walker · 2025-07-17T13:45:23Z

This issue is resolved: custom_classpath can now reference an environment variable through an @tag, and the variable's value is used in its place. For example:


AWFY=./core-lib/Examples/AreWeFastYet/Core
CLASSPATH=./Smalltalk

"
VM:
    status: success
    custom_classpath: @AWFY:@CLASSPATH
    stdout:
        nil

"

custom_classpath= ./core-lib/Examples/AreWeFastYet/Core:./Smalltalk
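
The substitution can be sketched in a few lines. This is an illustration of the idea only; `expand_classpath` and its error handling are hypothetical and not taken from the runner.

```python
import re


def expand_classpath(value, env):
    """Replace each @NAME in value with env[NAME] (hypothetical helper)."""

    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"Environment variable {name} is not set")
        return env[name]

    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)", lookup, value)


env = {
    "AWFY": "./core-lib/Examples/AreWeFastYet/Core",
    "CLASSPATH": "./Smalltalk",
}
print(expand_classpath("@AWFY:@CLASSPATH", env))
```

Text without an `@` prefix is passed through unchanged, so fixed paths and tags can be mixed in one classpath.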

This is a PyTest-based runner for the integration tests of SOM. Inspired by lang_tester https://github.com/softdevteam/lang_tester

- updated CI to run Black and PyLint, and integration tests
- it can generate yaml file with currently failing tests. It can function as a TEST_EXCEPTIONS file with failing tests and passing tests, added and removed as needed. It should not be blindly followed but makes the generation of a tags file very easy.
- it comes with additional information like number passed/skipped and which ones exactly have been changed.
- Tests are case insensitive unless instructed not to be through `case_sensitive: true` in the tests.
- a test can be expected to fail. This allows for testing of test_runner features. Not ideal but makes sure case_sensitivity works correctly.

… string. - skip test on Unicode Failure

- Removed leftover print statements
- Updated error handling to just show diff unless no diff is available
- EXECUTABLE changed to be VM
- Added some basic test_runner tests
- Make a diff reverted to just be a diff and not have line numbers
- be strict on inclusion in stdout/stderr
- Removed the lang_tests and adjusted the test_runner logic to now print a proper error on /Tests folder not found
- Fix conftest to output a string rather than an object for exitstatus
- Fixed error with missing or empty tag file, added test to assert that the yaml files are working correctly
- Added an additional test to check the robustness of read_test_exceptions
- Fixed an error where specifying case_sensitive: False would not register as False but True instead
- Added a new test to test the discovery of lang_tester valid tests inside a directory
- Pathname for TEST_EXCEPTIONS now just takes a filename if it is located within IntegrationTests.
- Adjusted test_test_diccovery to use relative path names rather than hard coded
- There are now two options for running pytest: normally, which executes just the som tests, and with the argument -m tester, which will run the test_runner tests located in test_tester
- Added deselect message when tests are deselected due to runner choice
- Filepaths must now be given from IntegrationTests onwards so: Tests/test.som
- Updated to now use relative paths rather than hardcoded for TEST_EXCEPTIONS

…k if there for word on right.

Signed-off-by: Stefan Marr <[email protected]>

- simplify assertions, pytest already shows that info
- make failures pytest.fail, and move reading of envvars into prepare_tests()

Signed-off-by: Stefan Marr <[email protected]>

- support relative paths from CWD for testing
- GENERATE_REPORT is now compatible with multiple directories. It will remove anything in the path from before Tests/
- test exceptions no longer alters path to yaml file
- can now specify @tag_name in custom_classpath to load an environment variable by that name. Useful for custom classpath loading
- updated pytest parameterize to not run prepare_tests twice and IDs now match what is expected
- updated GENERATE_REPORT to have all tests with a consistent name from Tests/
- remove redundant assert messages
- split parse_test_file into methods, allows for more fine grained testing
- updated README
- added a new test for check_partial_word

- turn test into parameterized ones, which gives errors that are more explicit about what’s going wrong
- rename and restructure various bits
- rewrite parser of test definitions to be a more classic single-pass parser
- reify test definition as object, and make env var failures test definitions that fail as a test
- avoid opening and reading file content when detecting tests. First, we find candidates, and then we check whether we can parse them.
- build the error message from the test definition
- test and handle case sensitive setting
- use an object for report details, instead of a bunch of globals on the module
- rename envvar to TEST_EXPECTATIONS
- use $ to identify envvars in the test configuration
- rename envvar to GENERATE_EXPECTATIONS_FILE
- revise README.md

Signed-off-by: Stefan Marr <[email protected]>

- SOM-RS shouldn’t need custom test command anymore
- JAVA_HOME for TruffleSOM should already be set by the build step
- set env vars for PySOM in build step
- use GITHUB_ENV file for relevant variables

Signed-off-by: Stefan Marr <[email protected]>

…linked Signed-off-by: Stefan Marr <[email protected]>

It’s a little buggy… Signed-off-by: Stefan Marr <[email protected]>

This PR adds the support for running the SOM integration tests as introduced with SOM-st/SOM#128 and SOM-st/SOM#129. SOM++ has various tests not passing, often because SOM does not yet specify what the expected behavior should be.

smarr reviewed Jul 10, 2025

smarr reviewed Jul 11, 2025