Inventory a practical case of Python automation office

Oct 17, 2022

| Share on:

Hello everyone, I’m Pippi.

I. Introduction

A few days ago in the Python diamond exchange group [Hxy let me fat] asked a question about Python automation office, the screenshot of the question is as follows:

The desired effect is as follows:

To be precise, this is not a problem, but a real need.

Second, the realization process

Here [Jason] gave a feasible idea, as follows:

Later, [Mr. Yu Liang] gave a specific code, as shown below:

import re

from docx import Document

import pandas as pd

document = Document(“Judgment (Bracket Processing) (1).docx”)

all_paragraphs = document.paragraphs

data = [paragraph.text for paragraph in all_paragraphs if ‘√’ in paragraph.text or ‘×’ in paragraph.text]

data = “.join(data)

res = re.findall(’[√×]‘, data, re.S)

res = [f’{k + 1}.{v}’ for k, v in enumerate(res)]

df = pd.DataFrame(res)

df.to_excel(‘test9-13.xlsx’, index=False, header=None)

Really too strong!

After the code is run, the expected result can be obtained, as shown in the following figure:

Later, based on this code, [Eating Hawthorn Slices Crazy] came up with a simplified version, the code is as follows:

import re

from docx import Document

import pandas as pd

document = Document(r”Judgment (Bracket Processing)(1).docx”)

text = document.part.blob.decode(‘utf-8’)

text = re.sub(r’<.*?>’, “, text)

text = re.sub(r’.\s+‘, r’.‘, text)

df = pd.DataFrame(re.findall(r’\d+.[√×]‘, text))

df.to_excel(‘result.xlsx’, header=None, index=False)

This technology is really home, superb.

After the code runs, this requirement can also be fully realized.

Later, [Mr. Yu Liang] also gave a code, which is also very good, as shown below:

data = [paragraph.text for paragraph in all_paragraphs if ‘√’ in paragraph.text or ‘×’ in paragraph.text]

Combine into one long string, then replace to remove all spaces

data = “.join(data).replace(’ ‘, “)

Use re regular expression to extract all answers with question numbers

res = re.findall(r’\d+.[√×]‘, data, re.S)

df = pd.DataFrame(res)

df.to_excel(‘test9-13.xlsx’, index=False, header=None)

It’s amazing! Replacing and deleting all the extra spaces can prevent the answer from containing spaces and cannot be matched by the regular r’\d+.[√×]‘, so it is done in one step. Stop using list comprehensions to construct answers.

Do you think this is the end?

Later, [Ning] used the openpyxl library to get it done. The code is shown in the following figure:

import re

import docx

import openpyxl

def str_work(string:str):

return [*filter(None,re.split(’.’,re.sub(’\d+‘,“,string.replace(’ ‘, “).replace(’\n’, “))))]

wb = openpyxl.Workbook()

ws = wb.active

ws.append([‘question’,‘answer’])

doc = docx.Document(r’C:\Users\Administrator\Desktop\judgment (bracket processing).docx’)

doc_text = ‘\n’.join(( i.text for i in doc.paragraphs[3:]))

doc_list = doc_text.split(’\nOne, True or False’)

title_row = [i.strip() for i in doc_list[0].split(’\n’) if i.strip().split(‘、’)!=[“]]

answer_row = [i for i in str_work(doc_list[1])]

for i in zip(title_row,answer_row):

ws.append(list(i))

wb.save(‘1.xlsx’)

The result obtained after running is shown in the following figure:

Summary ==========

Hello everyone, I’m Pippi. This article mainly takes stock of a problem of Python automation office. In response to this problem, the article gives specific analysis and code implementation to help fans solve the problem smoothly.

Finally, I would like to thank the fans [Hxy let me fat] for their questions, [Jason], [Ms. Yu Liang], [Eat Hawthorn Slices Crazy], and [Classmate Ning] for their ideas and code analysis, and thanks to [dcpeng], [Postpartum Repair] , [such creatures], [Yu Kefu] and others participated in the study and exchange.

Latest Programming News and Information | GeekBar

Inventory a practical case of Python automation office

I. Introduction

Second, the realization process

Combine into one long string, then replace to remove all spaces

Use re regular expression to extract all answers with question numbers

Related Posts