Solving a Real-World Statistical Analysis Problem with PostgreSQL

老牧童 • 2022-12-15 18:50 • PostgreSQL

Author: 老农民 (刘启华)
Email: 46715422@qq.com
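A while back a friend handed me an unusual request. His company had run a questionnaire survey, and every response was saved as an Excel document in the same format: roughly 1,300 responses, one per Excel file. The questionnaire had 32 questions, a mix of single-choice and multiple-choice. He said there was no practical way to consolidate and analyze them by hand, asked whether I had a solution, and offered to pay for the help.
Each question offered the options A, B, C, D, E and F. What they needed, for every question, was how often each option was chosen and its share of the total. After studying the layout of the Excel files, I wrote a few short Python scripts that gathered the answers from all the Excel documents into a single CSV file, one record per questionnaire file. The resulting CSV is shown in the screenshot below: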

[Screenshot: the consolidated CSV file]
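The Python scripts themselves are not included in the post. As a rough sketch of the consolidation step only, the code below writes one CSV record per Excel file. The cell positions (department in B1, answers in B2 to B4), the function names, and the use of the third-party `openpyxl` package are all assumptions made for illustration; the real questionnaire layout is not described here.

```python
import csv
from pathlib import Path

# Hypothetical layout: department name in B1, answers to questions 1-3 in B2..B4.
ANSWER_ROWS = (2, 3, 4)


def read_answers(xlsx_path):
    """Return [dept, tm1, tm2, tm3] for one questionnaire file (assumed layout)."""
    from openpyxl import load_workbook  # third-party package, assumed installed

    ws = load_workbook(xlsx_path, data_only=True).active
    dept = ws.cell(row=1, column=2).value
    answers = [ws.cell(row=r, column=2).value or "" for r in ANSWER_ROWS]
    return [dept, *answers]


def consolidate(folder, out_csv, reader=read_answers):
    """Write one CSV record per .xlsx file in `folder`; return the record count."""
    files = sorted(Path(folder).glob("*.xlsx"))
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["dept", "tm1", "tm2", "tm3"])  # header row
        for path in files:
            writer.writerow(reader(path))
    return len(files)
```

Passing the reader in as a parameter keeps the file-walking logic separate from the layout-specific part, which is the piece that would have to be adapted to the actual Excel format.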

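-- Create table wj, then import the CSV file into it
create table wj
(
    dept character varying(20) not null,  -- department
    tm1  character varying(8),            -- option letter(s) chosen for question 1
    tm2  character varying(8),            -- option letter(s) chosen for question 2
    tm3  character varying(8)             -- option letter(s) chosen for question 3
);
alter table wj
    owner to postgres;
-- Check the imported data:
select * from wj;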
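The import step itself is not spelled out above. One possible way to do it from Python is to stream the file through `COPY ... FROM STDIN`. This is only a sketch: it assumes the CSV has a header row with the columns `dept, tm1, tm2, tm3`, that `conn` is a psycopg2 connection (whose cursors provide `copy_expert`), and the function name `load_csv` is made up for this example.

```python
# Assumes the CSV starts with a header row and has exactly these four columns.
COPY_SQL = "COPY wj (dept, tm1, tm2, tm3) FROM STDIN WITH (FORMAT csv, HEADER true)"


def load_csv(conn, csv_path):
    """Stream the CSV into table wj; `conn` is assumed to be a psycopg2 connection."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        with conn.cursor() as cur:
            cur.copy_expert(COPY_SQL, f)  # psycopg2 helper for COPY ... FROM STDIN
    conn.commit()


# Possible usage (connection string is a placeholder):
#   import psycopg2
#   conn = psycopg2.connect("dbname=postgres")
#   load_csv(conn, "wj.csv")
```

psql's `\copy` command or a GUI import tool would do the same job; the point is only that the CSV columns must line up with the columns of `wj`.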

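/*
Create the key statistical view wj_st. Multiple-choice answers have to be expanded
from one column value into several rows, so we use the string-splitting function
regexp_split_to_table:
*/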
create view wj_st as
select a.dept,
       a.tm,
       a.val,
       a.xj
from (select dept,
             'tm1' as tm,
             regexp_split_to_table(coalesce(tm1, ''), '') as val,
             count(*) as xj
      from wj
      group by dept, regexp_split_to_table(coalesce(tm1, ''), '')
      union all
      select dept,
             'tm2' as tm,
             regexp_split_to_table(coalesce(tm2, ''), '') as val,
             count(*) as xj
      from wj
      group by dept, regexp_split_to_table(coalesce(tm2, ''), '')
      union all
      select dept,
             'tm3' as tm,
             regexp_split_to_table(coalesce(tm3, ''), '') as val,
             count(*) as xj
      from wj
      group by dept, regexp_split_to_table(coalesce(tm3, ''), '')) a;
alter table wj_st
    owner to postgres;
-- Query the view:
select * from wj_st;
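To make the column-to-rows step concrete, here is a small Python sketch of the same idea: every answer string is split into single option letters, and the letters are counted per department and question. The function name and the dict-based row format are assumptions for illustration; in this sketch empty or missing answers simply contribute no rows.

```python
from collections import Counter


def split_and_count(rows, questions=("tm1", "tm2", "tm3")):
    """Count chosen options per (dept, tm, val).

    rows: dicts such as {"dept": "HR", "tm1": "A", "tm2": "BC", "tm3": None}.
    The result plays the role of the dept / tm / val / xj columns of wj_st.
    """
    counts = Counter()
    for row in rows:
        for tm in questions:
            # A multiple-choice answer like "BC" yields one entry per letter.
            for val in (row.get(tm) or ""):
                counts[(row["dept"], tm, val)] += 1
    return counts


sample = [
    {"dept": "HR", "tm1": "A", "tm2": "BC", "tm3": None},
    {"dept": "HR", "tm1": "A", "tm2": "C", "tm3": "D"},
]
for key, xj in sorted(split_and_count(sample).items()):
    print(key, xj)
```

Splitting on the empty pattern works because each option is a single character; answers stored with separators (for example "B,C") would need a different split pattern.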

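-- First create the tablefunc extension, which provides crosstab:
CREATE EXTENSION IF NOT EXISTS tablefunc;
-- Pivot the view wj_st: one row per department and question, one column per option
select
    -- depttm is dept || tm; its last 3 characters are the question id (tm1..tm3)
    substr(depttm, 1, char_length(depttm) - 3) 部门,
    substr(depttm, char_length(depttm) - 2) 题目,
    a,
    b,
    c,
    d,
    e,
    f
from crosstab
    (
        'select dept||tm, val, xj from wj_st order by 1',
        -- category query: must return exactly the six option values, in a..f order
        'select distinct val from wj_st order by 1') as
    (depttm text, a integer, b integer, c integer, d integer, e integer, f integer);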
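For readers unfamiliar with crosstab, the pivot it performs can be sketched in a few lines of Python: counts keyed by (department, question, option) are turned into one row per (department, question) with a fixed slot for each option. The function name and input format are assumptions for illustration. One difference to note is that this sketch fills options nobody chose with 0, whereas crosstab leaves such cells NULL.

```python
def pivot(counts, options=("A", "B", "C", "D", "E", "F")):
    """Turn {(dept, tm, val): n} into {(dept, tm): [n_A, n_B, ..., n_F]}."""
    index = {opt: i for i, opt in enumerate(options)}
    table = {}
    for (dept, tm, val), n in counts.items():
        if val not in index:
            continue  # ignore anything that is not one of the known options
        row = table.setdefault((dept, tm), [0] * len(options))
        row[index[val]] += n
    return table


sample = {("HR", "tm1", "A"): 2, ("HR", "tm1", "C"): 1, ("IT", "tm2", "F"): 4}
for key, row in sorted(pivot(sample).items()):
    print(key, row)
```

Listing the options explicitly, instead of deriving them from the data, keeps the column order stable even when some option was never chosen.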

With these statistics in hand, we can now carry out whatever further analysis the requirements call for!
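As one example of such follow-up work, the share of each option can be derived from a row of pivoted counts. The sketch below is an assumption about how the percentages might be defined: each count is divided by the total number of selections for that department and question, which for multiple-choice questions is not the same as the number of respondents. The function name is hypothetical.

```python
def option_shares(row_counts):
    """Return each option's percentage of all selections in one pivoted row."""
    total = sum(row_counts)
    if total == 0:
        return [0.0] * len(row_counts)  # nobody answered: avoid dividing by zero
    return [round(100.0 * n / total, 1) for n in row_counts]


print(option_shares([2, 0, 1, 0, 0, 1]))
```

If the share of respondents is wanted instead, the divisor would be the number of questionnaires for that department rather than the sum of the row.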
