Pivot duplicate rows into new columns in Pandas
- Author: 大兵0小胖
- Source: 51数据库
- 2022-10-27
Problem description
I have a data frame like the one below, and I'm trying to reshape it using Pivot from Pandas so that I keep some values from the original rows while turning the duplicate rows into new columns and renaming them. Sometimes a row has 5 duplicates.
I have been trying, but I can't figure it out.
import pandas as pd
df = pd.read_csv("C:dummy")
df = df.pivot(index=["ID"], columns=["Zone","PTC"], values=["Zone","PTC"])
# Rename columns and reset the index.
df.columns = [["PTC{}","Zone{}"],.format(c) for c in df.columns]
df.reset_index(inplace=True)
# Drop duplicates
df.drop(["PTC","Zone"], axis=1, inplace=True)
Input
ID  Agent  OV   Zone  Value  PTC
1   10     26   M1    10     100
2   26.5   8    M2    50     95
2   26.5   8    M1    6      5
3   4.5    6    M3    4      40
3   4.5    6    M4    6      60
4   1.2    0.8  M1    8      100
5   2      0.4  M1    6      10
5   2      0.4  M2    41     86
5   2      0.4  M4    2      4
Output
ID  Agent  OV   Zone1  Value1  PTC1  Zone2  Value2  PTC2  Zone3  Value3  PTC3
1   10     26   M_1    10      100   0      0       0     0      0       0
2   26.5   8    M_2    50      95    M_1    6       5     0      0       0
3   4.5    6    M_3    4       40    M_4    6       60    0      0       0
4   1.2    0.8  M_1    8       100   0      0       0     0      0       0
5   2      0.4  M_1    6       10    M_2    41      86    M_4    2       4
Recommended answer
Use cumcount to number the rows within each group, create a MultiIndex with set_index, reshape with unstack, and finally flatten the resulting column MultiIndex:
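# Number the rows within each (ID, Agent, OV) group, starting at 1.
g = df.groupby(["ID","Agent", "OV"]).cumcount().add(1)
# Add the counter as an index level, move it into the columns, and fill gaps with 0.
df = df.set_index(["ID","Agent","OV", g]).unstack(fill_value=0).sort_index(axis=1, level=1)
# Flatten the column MultiIndex, e.g. ("Zone", 1) -> "Zone1".
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
df = df.reset_index()
print(df)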
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100     0       0     0     0       0     0
1   2   26.5   8.0    M2      50    95    M1       6     5     0       0     0
2   3    4.5   6.0    M3       4    40    M4       6    60     0       0     0
3   4    1.2   0.8    M1       8   100     0       0     0     0       0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
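For readers who want to try this without the CSV file, here is a minimal self-contained sketch of the same cumcount / set_index / unstack idea. The sample data is copied from the question's input table; the names keys, counter, wide and order are only illustrative. Note one deliberate difference from the answer: instead of sort_index(axis=1, level=1), the sketch lists the column order explicitly and applies it with reindex, so the Zone/Value/PTC order within each group does not depend on how the sort treats the remaining level.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10, 26.5, 26.5, 4.5, 4.5, 1.2, 2, 2, 2],
    "OV":    [26, 8, 8, 6, 6, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]

# Position of each row within its (ID, Agent, OV) group: 1, 2, 3, ...
counter = df.groupby(keys).cumcount().add(1)

# Use the keys plus the counter as the index, then move the counter
# into the columns; combinations that do not exist are filled with 0.
wide = df.set_index(keys + [counter]).unstack(fill_value=0)

# Assumption for this sketch: order the columns explicitly as
# Zone1, Value1, PTC1, Zone2, ... rather than relying on sort_index.
order = [
    (name, i)
    for i in range(1, int(counter.max()) + 1)
    for name in ["Zone", "Value", "PTC"]
]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# Flatten ("Zone", 1) -> "Zone1" and turn the keys back into columns.
wide.columns = [f"{name}{i}" for name, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

The counter is what makes the reshape possible: without it, the index built from ID, Agent and OV would contain duplicates, and unstack (like pivot) cannot reshape on duplicate entries.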
If you want to replace missing values with 0 only in the numeric columns:
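# df is the original input frame again here, not the reshaped result above.
g = df.groupby(["ID","Agent","OV"]).cumcount().add(1)
# No fill_value this time, so missing combinations stay NaN for now.
df = df.set_index(["ID","Agent","OV", g]).unstack().sort_index(axis=1, level=1)
idx = pd.IndexSlice
# Fill NaN with 0 in the numeric columns only (Value, PTC).
df.loc[:, idx[['Value','PTC']]] = df.loc[:, idx[['Value','PTC']]].fillna(0).astype(int)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
# The remaining NaN values (in the Zone columns) become empty strings.
df = df.fillna('').reset_index()
print(df)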
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100             0     0             0     0
1   2   26.5   8.0    M2      50    95    M1       6     5             0     0
2   3    4.5   6.0    M3       4    40    M4       6    60             0     0
3   4    1.2   0.8    M1       8   100             0     0             0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
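The numeric-only variant can also be sketched in a self-contained form. This is not the answer's exact code: rather than assigning through .loc with IndexSlice, the sketch below splits the unstacked frame into its text part (Zone) and its numeric part (Value, PTC), fills each separately, and joins them again with pd.concat. The sample data comes from the question; keys, counter, text, numeric, order and wide are illustrative names.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10, 26.5, 26.5, 4.5, 4.5, 1.2, 2, 2, 2],
    "OV":    [26, 8, 8, 6, 6, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]
counter = df.groupby(keys).cumcount().add(1)

# Without fill_value, combinations that do not exist become NaN.
wide = df.set_index(keys + [counter]).unstack()

# Treat the two kinds of columns separately:
# text columns get an empty string, numeric columns get 0 as integers.
text = wide[["Zone"]].fillna("")
numeric = wide[["Value", "PTC"]].fillna(0).astype(int)
wide = pd.concat([text, numeric], axis=1)

# Assumption for this sketch: explicit Zone1, Value1, PTC1, Zone2, ... order.
order = [
    (name, i)
    for i in range(1, int(counter.max()) + 1)
    for name in ["Zone", "Value", "PTC"]
]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# Flatten ("Zone", 1) -> "Zone1" and turn the keys back into columns.
wide.columns = [f"{name}{i}" for name, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Handling the two groups separately keeps the Value and PTC columns as integers, while the Zone columns only ever contain strings.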
