方便更快捷的说明问题,可以按需填写(可删除)
使用环境:
springboot + shardingSphere5.0 + mybatis + mysql
场景、问题:
实现分库分表情况下,查询跨年的数据,返回了同一时间的两条数据,其中一条为0
已进行操作:
现状:
配置
方便更快捷的说明问题,可以按需填写(可删除)
springboot + shardingSphere5.0 + mybatis + mysql
实现分库分表情况下,查询跨年的数据,返回了同一时间的两条数据,其中一条为0
配置
这种情况需要开启 sql-show,并分析 Logic SQL 和 Actual SQL。
report_site_day_2021 传的参数还是 2021-12-20, 2022-01-18
report_site_day_2022 传的参数还是 2021-12-20, 2022-01-18
看了你的描述,我不太清楚具体问题。
能不能提供一个 demo 让我复现一下?
dimdate表:
CREATE TABLE dimdate
(
DateKey
INT(11) NOT NULL,
FullDateAlternateKey
DATE NOT NULL,
PRIMARY KEY (DateKey
),
KEY DateKey
(DateKey
) USING BTREE
) ENGINE=INNODB DEFAULT CHARSET=utf8 COMMENT=‘时间明细表’
数据表:
CREATE TABLE report_site_day
(
datatime
INT(11) NOT NULL,
innum
DATE NOT NULL,
PRIMARY KEY (datatime
),
KEY datatime
(datatime
) USING BTREE
) ENGINE=INNODB DEFAULT CHARSET=utf8 COMMENT=‘数据表’
一个查询语句查询的时间条件是2021-12-20到2022-01-18。
report_site_day为数据表 用时间和dimdate关联
sql:
SELECT dim.FullDateAlternateKey,rs.innum FROM dimdate dim LEFT JOIN report_site_day rs ON dim.FullDateAlternateKey = rs.datatime
WHERE dim.FullDateAlternateKey BETWEEN ‘2021-12-20’ AND ‘2022-01-18’
用分库report_site_day_2021 ,report_site_day_2022 分别对主表dimdate进行查询,最后实际返回的数量为原来的两倍
能把相关分表配置贴一下吗?
shardingsphere:
datasource:
ds0:
driver-class-name: com.mysql.cj.jdbc.Driver
jdbc-url: ***
password:
type: com.zaxxer.hikari.HikariDataSource
username: root
names: ds0
# names: ds2015,ds2016,ds2017,ds2018,ds2019,ds2020,ds2021
rules:
sharding:
tables:
report_site_day:
actual-data-nodes: ds0.report_site_day_$->{2015..2022}
table-strategy:
standard:
sharding-column: datatime
sharding-algorithm-name: my-db-range-sharding-algorithm
shardingAlgorithms:
my-db-range-sharding-algorithm:
type: CLASS_BASED
props:
strategy: standard
algorithmClassName: com.winner.algorithm.MyDBRangeShardingAlgorithm
props:
# sql-federation-enabled: true
sql-show: true
# sql-simple: true
dimdate 数据:
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211220’,‘2021-12-20’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211221’,‘2021-12-21’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211222’,‘2021-12-22’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211223’,‘2021-12-23’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211224’,‘2021-12-24’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211225’,‘2021-12-25’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211226’,‘2021-12-26’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211227’,‘2021-12-27’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211228’,‘2021-12-28’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211229’,‘2021-12-29’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211230’,‘2021-12-30’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20211231’,‘2021-12-31’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220101’,‘2022-01-01’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220102’,‘2022-01-02’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220103’,‘2022-01-03’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220104’,‘2022-01-04’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220105’,‘2022-01-05’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220106’,‘2022-01-06’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220107’,‘2022-01-07’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220108’,‘2022-01-08’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220109’,‘2022-01-09’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220110’,‘2022-01-10’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220111’,‘2022-01-11’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220112’,‘2022-01-12’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220113’,‘2022-01-13’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220114’,‘2022-01-14’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220115’,‘2022-01-15’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220116’,‘2022-01-16’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220117’,‘2022-01-17’);
insert into dimdate
(DateKey
, FullDateAlternateKey
) values(‘20220118’,‘2022-01-18’);
package com.winner.algorithm;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;
public class MyDBRangeShardingAlgorithm implements StandardShardingAlgorithm<Integer> {
@Override
public String doSharding(Collection<String> collection, PreciseShardingValue<Integer> preciseShardingValue) {
/**
* databaseNames 所有分片库的集合
* shardingValue 为分片属性,其中 logicTableName 为逻辑表,columnName 分片健(字段),value 为从 SQL 中解析出的分片健的值
*/
String year = String.valueOf(preciseShardingValue.getValue()).substring(0, 4);
for (String databaseName : collection) {
if (databaseName.endsWith(year)) {
System.out.println("return :" + year);
return databaseName;
}
}
return "";
}
@Override
public Collection<String> doSharding(Collection<String> collection, RangeShardingValue<Integer> rangeShardingValue) {
Set<String> result = new LinkedHashSet<>();
// // between and 的起始值
String lowerStr = String.valueOf(rangeShardingValue.getValueRange().lowerEndpoint()).substring(0, 4);
String upperStr = String.valueOf(rangeShardingValue.getValueRange().upperEndpoint()).substring(0, 4);
int lower = Integer.valueOf(lowerStr);
int upper = Integer.valueOf(upperStr);
// // 循环范围计算分库逻辑
for (int i = lower; i <= upper; i++) {
for (String databaseName : collection) {
if (databaseName.endsWith(String.valueOf(i))) {
result.add(databaseName);
}
}
}
return result;
}
@Override
public void init() {
}
@Override
public String getType() {
return null;
}
}
我看到你的配置了,report_site_day 这张表是分表,dimdate 这张表是字典表。
执行 sql
SELECT dim.FullDateAlternateKey,rs.innum FROM dimdate dim LEFT JOIN report_site_day rs ON dim.FullDateAlternateKey = rs.datatime
WHERE dim.FullDateAlternateKey BETWEEN '2021-12-20' AND '2022-01-18'
执行的结果有重复数据?
是这个问题吗?
是的,跨一年就有两倍的数据
看起来你的查询条件中的分片键和配置中的分片键不同。
改成这个试试
SELECT dim.FullDateAlternateKey,rs.innum FROM dimdate dim LEFT JOIN report_site_day rs ON dim.FullDateAlternateKey = rs.datatime WHERE rs.datatime BETWEEN '2021-12-20' AND '2022-01-18'
这种写法就达不到时间字典表的想要的效果。
@用户1 确实有数据重复的问题。后期我们可能会考虑采用查询优化修复这个问题。
从原理上看,因为字典表只有一张,但是分表有五张,另外没有与分片键相关的查询条件,那么会采用全路由的方式,即将 sql 拆成 5 条 sql 执行,并对结果进行归并,所以会出现重复数据。
可以考虑将分表做为主表。
可以考虑inner join或者加个条件判断 WHERE rs.innum is not null
貌似解决不了吧。
你可以在页面选择俩个年份数据 并保证这俩个年份都不在数据的情况下 是否对获取分页 COUNT 报数组越界