branch-3.1: [fix](inverted index) Refine char_group tokenizer validation #55126 #55191

Open: wants to merge 1 commit into branch-3.1

github-actions[bot]
Contributor

Cherry-picked from #55126

github-actions bot requested a review from morrySnow as a code owner on August 22, 2025 10:34
@Thearas
Contributor

Thearas commented Aug 22, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

dataroaring reopened this Aug 22, 2025
@Thearas
Contributor

Thearas commented Aug 22, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32746 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit e82639eb605cd4f4fb59790be4cfd3b9fafc0b54, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17592	5454	5425	5425
q2	2023	425	283	283
q3	11402	1233	749	749
q4	10494	877	463	463
q5	9540	2414	2129	2129
q6	186	162	133	133
q7	903	730	624	624
q8	9336	1427	1127	1127
q9	5320	5008	4982	4982
q10	6774	2267	1827	1827
q11	464	274	262	262
q12	347	354	212	212
q13	17780	3641	3023	3023
q14	225	238	212	212
q15	524	463	464	463
q16	429	435	378	378
q17	618	862	370	370
q18	6830	6311	6348	6311
q19	1219	961	572	572
q20	338	351	221	221
q21	2849	2168	1979	1979
q22	1059	1033	1001	1001
Total cold run time: 106252 ms
Total hot run time: 32746 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	6333	5464	5504	5464
q2	237	326	240	240
q3	2259	2654	2323	2323
q4	1302	1791	1405	1405
q5	4421	4993	4921	4921
q6	176	166	126	126
q7	2072	1954	1826	1826
q8	2615	2823	2706	2706
q9	7258	7256	7388	7256
q10	3088	3258	2806	2806
q11	575	508	491	491
q12	699	774	626	626
q13	3487	3885	3243	3243
q14	282	291	286	286
q15	524	458	475	458
q16	438	496	438	438
q17	1250	1744	1275	1275
q18	7628	7423	7264	7264
q19	830	1193	1098	1098
q20	2039	2026	1918	1918
q21	5246	4802	4575	4575
q22	1102	1053	977	977
Total cold run time: 53861 ms
Total hot run time: 51722 ms

@doris-robot

TPC-DS: Total hot run time: 193519 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e82639eb605cd4f4fb59790be4cfd3b9fafc0b54, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	958	395	384	384
query2	6240	1993	1959	1959
query3	8687	205	202	202
query4	33889	24105	23504	23504
query5	3694	604	455	455
query6	298	198	188	188
query7	4220	508	318	318
query8	314	249	243	243
query9	9340	2625	2611	2611
query10	488	338	262	262
query11	18237	15495	15237	15237
query12	154	109	106	106
query13	1550	542	427	427
query14	9042	7362	7613	7362
query15	225	200	187	187
query16	8065	669	475	475
query17	1599	794	614	614
query18	2212	429	332	332
query19	240	191	168	168
query20	131	124	122	122
query21	211	128	109	109
query22	4526	4579	4526	4526
query23	34967	34341	33948	33948
query24	7310	2802	2791	2791
query25	532	504	423	423
query26	806	304	175	175
query27	1945	493	379	379
query28	5655	2224	2150	2150
query29	705	650	488	488
query30	246	194	163	163
query31	1004	950	861	861
query32	87	65	58	58
query33	513	407	362	362
query34	756	882	524	524
query35	769	808	774	774
query36	1020	1060	961	961
query37	110	95	69	69
query38	4063	4059	3959	3959
query39	1525	1490	1482	1482
query40	210	126	113	113
query41	62	51	51	51
query42	125	112	104	104
query43	535	525	495	495
query44	1412	830	832	830
query45	199	181	176	176
query46	917	1094	707	707
query47	1993	1947	1928	1928
query48	426	446	355	355
query49	751	485	411	411
query50	699	702	453	453
query51	7475	7349	7355	7349
query52	108	103	92	92
query53	238	269	196	196
query54	567	564	500	500
query55	84	82	139	82
query56	275	279	264	264
query57	1254	1260	1194	1194
query58	227	223	228	223
query59	3076	3175	3149	3149
query60	330	295	277	277
query61	118	113	123	113
query62	815	793	705	705
query63	246	206	207	206
query64	3847	993	654	654
query65	3356	3269	3330	3269
query66	785	416	309	309
query67	16191	15724	15424	15424
query68	7421	835	538	538
query69	516	320	276	276
query70	1209	1151	1119	1119
query71	375	320	272	272
query72	5825	3964	3887	3887
query73	628	757	355	355
query74	10403	9490	9007	9007
query75	3289	3134	2672	2672
query76	3210	1178	788	788
query77	669	389	283	283
query78	10295	10387	9622	9622
query79	3740	925	590	590
query80	771	550	445	445
query81	496	254	219	219
query82	567	114	88	88
query83	174	164	150	150
query84	284	114	81	81
query85	786	367	300	300
query86	355	312	304	304
query87	4372	4303	4221	4221
query88	3766	2439	2410	2410
query89	436	327	299	299
query90	1853	195	195	195
query91	139	141	110	110
query92	63	57	53	53
query93	2076	907	552	552
query94	690	398	326	326
query95	346	291	279	279
query96	497	629	295	295
query97	3298	3314	3270	3270
query98	221	211	204	204
query99	1601	1386	1322	1322
Total cold run time: 293025 ms
Total hot run time: 193519 ms

@doris-robot

ClickBench: Total hot run time: 29.12 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e82639eb605cd4f4fb59790be4cfd3b9fafc0b54, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.07	0.03	0.03
query3	0.24	0.07	0.07
query4	1.62	0.11	0.10
query5	0.53	0.53	0.50
query6	1.14	0.73	0.73
query7	0.02	0.02	0.02
query8	0.04	0.04	0.04
query9	0.56	0.52	0.50
query10	0.56	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.11	0.10
query13	0.61	0.60	0.59
query14	0.78	0.80	0.79
query15	0.87	0.85	0.84
query16	0.40	0.40	0.38
query17	1.06	1.00	0.99
query18	0.25	0.23	0.24
query19	1.95	1.83	1.87
query20	0.02	0.01	0.02
query21	15.37	0.90	0.57
query22	0.73	0.81	0.64
query23	15.14	1.38	0.62
query24	3.50	1.04	1.17
query25	0.20	0.18	0.10
query26	0.28	0.15	0.13
query27	0.04	0.06	0.06
query28	13.33	0.97	0.44
query29	12.59	3.91	3.24
query30	0.26	0.09	0.05
query31	2.84	0.60	0.38
query32	3.23	0.54	0.46
query33	3.02	3.04	3.07
query34	16.59	5.22	4.57
query35	4.60	4.58	4.56
query36	0.65	0.49	0.48
query37	0.09	0.05	0.05
query38	0.05	0.04	0.04
query39	0.04	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.03
query42	0.03	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 103.91 s
Total hot run time: 29.12 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

5 participants