jrnlfile is a dataset with journal names and identifiers. Here are the first 6 observations:
id journal issn
56201 ACTA HAEMATOLOGICA 0001-5792
94365 ACTA PHARMACOLOGICA SINICA
10334 ACTA PHARMACOLOGICA SINICA 1671-4083
55123 ADVANCES IN ENZYME REGULATION 0065-2571
90002 AGING
10403 AGING 1945-4589
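Compare ids 94365 and 10334. These obs have the same journal name, so they need to have the same issn. Obs with a missing issn almost always have at least one partner obs with a matching journal name and the correct issn. Wherever this is the case, I want to recode the missing issn so that it contains the same issn as the other obs for that journal. The revised dataset want looks like this: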
id journal issn
56201 ACTA HAEMATOLOGICA 0001-5792
94365 ACTA PHARMACOLOGICA SINICA 1671-4083
10334 ACTA PHARMACOLOGICA SINICA 1671-4083
55123 ADVANCES IN ENZYME REGULATION 0065-2571
90002 AGING 1945-4589
10403 AGING 1945-4589
I am currently using if-else statements in a data step to fill in the missing issn values for matching journals:
data want;
set jrnlfile;
if journal = "ACTA PHARMACOLOGICA SINICA" then issn = "1671-4083";
else if journal = "AGING" then issn = "1945-4589";
/*continue for 7,000 other journals*/
run;
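But jrnlfile contains 50,000 obs and 7,000 unique journals, so this takes a lot of time and is error-prone. This answer got me halfway there, but issn is not numeric, and I can't solve the problem by simply adding values.
What is a more efficient and systematic way to get want from jrnlfile?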
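You can use a retain statement, but this code has some limitations. Rows with an empty issn are filled with the first issn found for that journal after sorting, and every journal group must have at least one non-missing issn.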
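/* Sort so that, within each journal, non-missing issn values come first */
proc sort data=JRNLFILE;
  by journal descending issn;
run;

data want;
  set JRNLFILE;
  by journal descending issn;
  retain t_issn;  /* holds the first issn found for the current journal */
  if first.journal then do;
    if issn = "" then do;
      /* even the first row is blank, so the whole group has no issn */
      put "ERROR: there is no issn val for group";
      stop;
    end;
    else t_issn = issn;
  end;
  /* fill blank rows with the issn retained for this journal */
  if issn = "" then issn = t_issn;
run;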
For example, if you use this table:
+-------+------------------------------+-----------+
| id | journal | issn |
+-------+------------------------------+-----------+
| 94365 | ACTA PHARMACOLOGICA SINICA | |
| 10334 | ACTA PHARMACOLOGICA SINICA | 1671-4083 |
| 1 | ACTA PHARMACOLOGICA SINICA | A_TEST |
| 2 | ACTA PHARMACOLOGICA SINICA | WAS |
| 3 | ACTA PHARMACOLOGICA SINICA | SATRTED |
+-------+------------------------------+-----------+
You will get:
+-------+----------------------------+-----------+--------+
| id | journal | issn | t_issn |
+-------+----------------------------+-----------+--------+
| 2 | ACTA PHARMACOLOGICA SINICA | WAS | WAS |
| 3 | ACTA PHARMACOLOGICA SINICA | SATRTED | WAS |
| 1 | ACTA PHARMACOLOGICA SINICA | A_TEST | WAS |
| 10334 | ACTA PHARMACOLOGICA SINICA | 1671-4083 | WAS |
| 94365 | ACTA PHARMACOLOGICA SINICA | WAS | WAS |
+-------+----------------------------+-----------+--------+
Error example. If you use this table:
+-------+------------------------------+-----------+
| id | journal | issn |
+-------+------------------------------+-----------+
| 56201 | ACTA HAEMATOLOGICA | 0001-5792 |
| 94365 | ACTA PHARMACOLOGICA SINICA | |
+-------+------------------------------+-----------+
You will get an error in the log:
ERROR: there is no issn val for group
* t_issn is shown only to help understand how the code works :))
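Since the fill takes whichever issn sorts first within a journal, it may help to check both limitations before running the data step. The following is only a minimal sketch, assuming the dataset is named jrnlfile as in the question and that issn is a character variable; the alias n_issn is a name made up for illustration.

```sas
proc sql;
  /* Journals with no issn at all: the data step would stop on these. */
  /* count(issn) counts only non-missing values.                      */
  select journal
    from jrnlfile
    group by journal
    having count(issn) = 0;

  /* Journals with more than one distinct non-missing issn:           */
  /* the fill would use the value that sorts highest.                 */
  select journal, count(distinct issn) as n_issn
    from jrnlfile
    group by journal
    having count(distinct issn) > 1;
quit;
```

If both queries return no rows, every journal has exactly one non-missing issn, so the fill is unambiguous.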