From Assisted Coding to Integrated R&D

How teams can truly put AI programming to work

Plan / Skill / AGENTS.md / Security boundaries / Effectiveness evaluation

1. First, clarify the concepts

What is AI programming

It is not just code completion
It is not just answering “what does this code mean?”
It is not just generating a function or component

A more accurate term now is:

Agentic Coding

AI works continuously toward a goal:

Read the codebase
Understand the rules
Break down the task
Modify files
Run commands
Execute tests
Report results
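
The loop above can be sketched in a few lines. This is only an illustrative skeleton, not how any particular tool is implemented: `Step`, `Report`, and `run_agent` are hypothetical names, and each step is a plain callable standing in for real work such as reading the codebase or running tests.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # hypothetical: returns True when the step succeeds


@dataclass
class Report:
    goal: str
    completed: List[str] = field(default_factory=list)
    failed: Optional[str] = None


def run_agent(goal: str, steps: List[Step]) -> Report:
    """Run steps in order, stop at the first failure, and report what happened."""
    report = Report(goal=goal)
    for step in steps:
        if step.action():
            report.completed.append(step.name)
        else:
            report.failed = step.name
            break
    return report
```

The point of the sketch is the shape of the output: a record of which steps ran and where the run stopped, which is what makes the process reviewable.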

The difference between Agentic Coding and traditional AI coding

Dimension	Traditional AI coding	Agentic Coding
Input	A prompt	Repository + rules + tools + permissions
Output	Code snippets / suggestions	A reviewable process and result
Scope of work	Current file / current question	Multi-file / multi-step / long-chain tasks
Tool capabilities	Completion, explanation, generation	Read and write code, run commands, test, call tools
Collaboration style	Q&A-style	Task-based / agent-based

2. What mainstream tools look like

The 4 main forms of mainstream tools today

IDE inline completion tools: Cursor / Copilot / Windsurf
Terminal agent tools: Codex CLI / Codex App / Claude Code
Multi-agent / asynchronous collaboration tools: worktree, review queue, automations
Enterprise workflow integration tools: issue / docs / CI / design / review

IDE inline completion tools

Advantages:

Fast
Does not interrupt flow
Good for daily coding
Good for localized changes

Limitations:

Better suited for short tasks
Limited ability on multi-step tasks
Context usually centers on the current file

Conclusion:

More like an “enhanced editor”

Terminal agent tools

Examples:

Codex App / CLI
Claude Code

Characteristics:

Operate directly on the codebase
Can modify files across directories
Can run commands and tests
Better suited for large and long-running tasks
Closer to a “collaborative developer”

Conclusion:

This is the Agentic Coding approach most worth watching

Multi-agent / asynchronous collaboration tools

Keywords:

Multiple agents in parallel
worktree isolation
review queue
automations
Asynchronous execution of long tasks
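
As a rough illustration of worktree isolation, the sketch below builds the `git worktree add` command that would give each agent its own branch and working directory. The `agent/<id>` branch naming and the sibling-directory layout are assumptions made for this example. The function only returns the command, so it can be inspected or logged before anything is executed.

```python
from pathlib import Path
from typing import List


def worktree_command(repo: Path, agent_id: str, base: str = "main") -> List[str]:
    """Build a git command that creates an isolated worktree for one agent."""
    branch = f"agent/{agent_id}"                     # hypothetical naming scheme
    path = repo.parent / f"{repo.name}-{agent_id}"   # sibling directory per agent
    # git worktree add -b <new-branch> <path> <commit-ish>
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would create the worktree. Keeping command construction separate makes it easy to add a confirmation step first.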

What changes:

From “I am talking to one AI”

To

“I am orchestrating a group of agents to do work”

3. What really determines results is not model parameters

Why context engineering matters more

With the same model:

Without rules, output often only “looks right”
Without boundaries, it can easily exceed its authority
Without process, rework and review costs are high

Once rules, commands, boundaries, and processes are codified:

Stability improves
Reusability improves
Team collaboration costs go down

The implementation order I recommend for teams

`AGENTS.md` > Skill > MCP

Reasons:

AGENTS.md solves the need for a unified project-wide understanding
Skills solve the reuse of high-frequency workflows
MCP solves external system integration

Do not do it in reverse.

What `AGENTS.md` is for

It is:

A repository-level persistent instruction manual

What it is suitable for:

Repository structure
What can be changed and what cannot
Test and build commands
Code style and review requirements
Historical pitfalls and business constraints

Value:

Project rules are automatically included in every conversation
Reduces the cost of repeatedly explaining project background
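
One way to keep an `AGENTS.md` from drifting is a small check in CI. The sketch below assumes the file is Markdown and that the team has agreed on a set of section headings. The headings in `REQUIRED_SECTIONS` are hypothetical and simply mirror the topics listed above.

```python
import re
from typing import List, Sequence

# Hypothetical headings, one per topic the file is expected to cover.
REQUIRED_SECTIONS = (
    "Repository structure",
    "Boundaries",
    "Commands",
    "Style and review",
    "Known pitfalls",
)


def missing_sections(agents_md: str, required: Sequence[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return the required headings that do not appear in the Markdown text."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#+[ \t]+(.*)$", agents_md, re.MULTILINE)
    }
    return [section for section in required if section.lower() not in headings]
```

An empty result means every agreed topic has a section. It says nothing about whether the content under each heading is accurate, which still needs review.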

Why Skills are more suitable than MCP for initial team adoption

Skills are better suited to carrying:

code review
changelog generation
YApi docs-sync
issue troubleshooting
release checks
onboarding

In essence, they capture:

Methods, steps, constraints, and scripts

Rather than just “an external connection protocol.”

The 5 advantages of Skills

Better for capturing team methods
Easier to version and review
Usually have a smaller permission surface
More reproducible
Better suited to starting small and then expanding

In one sentence:

Skills are more like “standard operating procedures”

MCP is useful, but not the default answer

MCP is better suited for:

Figma
Jira / Linear
Google Drive / Docs
Slack / external knowledge bases
Real-time data systems

What it solves is:

A connectivity problem

Not:

A team methodology problem

Some practical issues with MCP

It solves connectivity, not methodology
Governance cost is higher
The security surface is larger
It is not suitable for carrying implicit project rules

Typical implicit rules include:

Which directories must not be touched
Which fields must not be changed
Which test suites must be run
Which logs to check first for which kinds of issues

These are better written into AGENTS.md and Skills.

4. How the mechanisms work

How does the model know when to use these mechanisms

Mechanism	Essence	Triggered by
`AGENTS.md`	Repository-level persistent instruction manual	Read by the agent / host when a task starts
`CLAUDE.md`	Claude Code persistent instruction file	Loaded at startup, with subdirectories loaded as needed
Skill	Reusable workflow package	Matched by the model or explicitly specified by the user
MCP Prompt	Template prompt	Triggered by the user
MCP Resource	External context	Attached by the application or referenced by the user
MCP Tool	External action interface	The model decides whether to call it
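
To make the "read when a task starts" row concrete, here is a minimal sketch of how a host could collect persistent instruction files. It assumes a simple rule: gather every file with the given name from the repository root down to the working directory, outermost first. The actual lookup order differs between tools, so treat `instruction_files` as a hypothetical helper rather than a description of Codex or Claude Code.

```python
from pathlib import Path
from typing import List


def instruction_files(repo_root: Path, workdir: Path, filename: str = "AGENTS.md") -> List[Path]:
    """Return instruction files from repo_root down to workdir, outermost first."""
    repo_root = repo_root.resolve()
    workdir = workdir.resolve()
    # Raises ValueError if workdir is not inside the repository.
    relative = workdir.relative_to(repo_root)

    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    return [d / filename for d in directories if (d / filename).is_file()]
```

Ordering the files outermost first lets rules closer to the working directory be appended last, which is one simple way to let subdirectory rules refine repository-wide ones.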

How MCP works

MCP is not a plugin, but a protocol.

Core structure:

host
client
server

Core capabilities:

prompts
resources
tools
sampling

How it works (connection lifecycle):

initialize
capability negotiation
normal operation
shutdown
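
The handshake can be illustrated by building the JSON-RPC messages a client sends. The message shapes follow the public MCP specification (`initialize` request, then a `notifications/initialized` notification). The client name, the version string `0.1.0`, and the helper function names are assumptions made for this sketch. No transport is involved, so nothing is actually sent.

```python
from typing import Dict, List


def initialize_request(request_id: int, client_name: str,
                       protocol_version: str = "2024-11-05") -> Dict:
    """Build the first message of an MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {"sampling": {}},  # what this client offers to servers
            "clientInfo": {"name": client_name, "version": "0.1.0"},
        },
    }


def initialized_notification() -> Dict:
    """Notifications carry no id because no response is expected."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def negotiated_features(server_result: Dict) -> List[str]:
    """List the capability names the server declared in its initialize result."""
    return sorted(server_result.get("capabilities", {}))
```

After the server's result arrives, the client only uses features that appear in the negotiated set. For example, it should not request a tool list from a server that never declared `tools`.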

How Skills work

A Skill is not a prompt.

It is more like a directory-based workflow package:

SKILL.md
scripts
references
resources

Key point:

Loaded on demand, rather than always occupying context

This is also why Skills can be more efficient than a “large system prompt.”
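
On-demand loading can be sketched as a two-stage read: index only a short header for every Skill, and read the full body only for the Skill that was selected. The sketch assumes each `SKILL.md` starts with a small `---` delimited header containing `name:` and `description:` lines. The parser here is deliberately simplified, and `SkillEntry`, `index_skills`, and `load_skill` are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


@dataclass
class SkillEntry:
    name: str
    description: str
    path: Path


def read_header(skill_md: Path) -> Dict[str, str]:
    """Parse simple 'key: value' lines between the leading '---' markers."""
    header: Dict[str, str] = {}
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return header
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            header[key.strip()] = value.strip()
    return header


def index_skills(skills_dir: Path) -> List[SkillEntry]:
    """Stage 1: collect only names and descriptions, which is cheap to keep in context."""
    entries = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        header = read_header(skill_md)
        name = header.get("name", skill_md.parent.name)
        entries.append(SkillEntry(name, header.get("description", ""), skill_md))
    return entries


def load_skill(entry: SkillEntry) -> str:
    """Stage 2: read the full instructions only once the Skill is selected."""
    return entry.path.read_text(encoding="utf-8")
```

With ten Skills installed, the context cost before any is used is roughly ten short descriptions rather than ten full procedures.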

The difference between `CLAUDE.md` and `AGENTS.md`

What they have in common:

Both are persistent instruction files
Both are used to inject project rules into coding agents

Differences:

CLAUDE.md: Anthropic has published a more detailed loading mechanism
AGENTS.md: OpenAI clearly states that it provides persistent context, but fewer implementation details are publicly available

Conclusion:

Neither is a tool invoker; both are entry points for long-term context

5. Security, sandboxing, and permission control

AI programming cannot be discussed only in terms of efficiency

Once an agent can:

Read files
Modify code
Run commands
Access the network

The risk model changes completely.

What really needs to be discussed is:

Filesystem isolation
Network isolation
Least privilege
Human confirmation
End-to-end traceability

I recommend teams do at least 5 things

Default to least privilege
Restrict filesystem boundaries
Use a network allowlist
Make every attempt to exceed permissions visible
Require human confirmation for high-risk actions
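
Two of these items, the filesystem boundary and the network allowlist, are easy to express as checks. The following is a minimal sketch under stated assumptions: `Policy`, `path_allowed`, and `url_allowed` are hypothetical names. A real sandbox enforces these limits at the operating-system level rather than in application code.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse


@dataclass(frozen=True)
class Policy:
    workspace: Path           # the only directory tree the agent may touch
    allowed_hosts: frozenset  # hostnames the agent may contact


def path_allowed(policy: Policy, target: str) -> bool:
    """Allow a path only if it resolves to the workspace or something inside it."""
    workspace = policy.workspace.resolve()
    resolved = (workspace / target).resolve()  # normalizes '..' and follows symlinks
    return resolved == workspace or workspace in resolved.parents


def url_allowed(policy: Policy, url: str) -> bool:
    """Allow a URL only if its hostname is on the allowlist."""
    return urlparse(url).hostname in policy.allowed_hosts
```

Denied checks should be logged as well as blocked, so that attempts to leave the boundary remain visible to the team.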

Examples of high-risk actions:

Pushing remote branches
Changing production configuration
Deleting large numbers of files
Running database changes
Calling real online write APIs
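
A confirmation gate for actions like these can start as a simple pattern match in front of command execution. The patterns below are illustrative assumptions and far from exhaustive. `risk_reasons` and `run_allowed` are hypothetical helpers, and the `confirm` callback stands in for whatever prompt or approval flow the team uses.

```python
import re
from typing import Callable, List

# Hypothetical patterns: label -> regular expression for a risky command.
HIGH_RISK_PATTERNS = {
    "push to remote": r"^\s*git\s+push\b",
    "recursive delete": r"\brm\s+-\w*[rR]\w*\b",
    "database schema change": r"\b(DROP|ALTER|TRUNCATE)\s+TABLE\b",
}


def risk_reasons(command: str) -> List[str]:
    """Return the labels of every high-risk pattern the command matches."""
    return [
        label
        for label, pattern in HIGH_RISK_PATTERNS.items()
        if re.search(pattern, command, re.IGNORECASE)
    ]


def run_allowed(command: str, confirm: Callable[[str], bool]) -> bool:
    """Low-risk commands pass; high-risk ones need an explicit human yes."""
    reasons = risk_reasons(command)
    if not reasons:
        return True
    return confirm(f"{command!r} is high risk ({', '.join(reasons)}). Proceed?")
```

Pattern matching only catches the obvious cases, so it complements sandboxing and least privilege rather than replacing them.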

6. How to measure results, not just demos

Why demos alone are not enough

A demo often only answers:

Does it look smart
Can it generate a decent piece of code

What teams really should ask is:

Can it handle real tasks reliably
Can it be reproduced in the team environment
How much rework and review cost does it introduce
Does it actually improve delivery efficiency

What external benchmarks can tell us

The value of SWE-bench:

Real GitHub issues
A reproducible evaluation environment
It measures whether problems can actually be fixed and tests can pass

What it shows:

Evaluating AI programming cannot rely only on demo videos and one-off examples

But it does not directly mean:

Your team has already improved efficiency

The 8 metrics teams should track more closely

Time to first usable patch
Task completion rate
First-pass merge rate
Manual rework time
Test pass rate
Review rejection rate
Documentation sync rate
Count of security boundary violations

In one sentence:

Look at the quality of real task completion, not just generation speed
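
Several of these metrics reduce to simple arithmetic over per-task records. The sketch below assumes the team logs one record per AI-assisted task. The field names are hypothetical, and every rate is computed over all recorded tasks, which is a simplifying choice (first-pass merge rate could instead be computed over completed tasks only).

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, List


@dataclass
class TaskRecord:
    completed: bool
    merged_first_pass: bool
    review_rejected: bool
    tests_passed: bool
    rework_minutes: float
    minutes_to_first_patch: float


def summarize(records: List[TaskRecord]) -> Dict[str, float]:
    """Aggregate per-task records into team-level metrics."""
    if not records:
        return {}
    total = len(records)
    return {
        "task_completion_rate": sum(r.completed for r in records) / total,
        "first_pass_merge_rate": sum(r.merged_first_pass for r in records) / total,
        "review_rejection_rate": sum(r.review_rejected for r in records) / total,
        "test_pass_rate": sum(r.tests_passed for r in records) / total,
        "avg_rework_minutes": mean(r.rework_minutes for r in records),
        "median_minutes_to_first_patch": median(r.minutes_to_first_patch for r in records),
    }
```

Tracking the same summary week over week matters more than any single value, since the baseline differs from team to team.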

7. Back to real team implementation

How we are implementing it now

Use Plan mode to break down work before development
Use a Skill system to codify high-frequency workflows
Use YApi docs-sync as the single source of truth for API definitions
Use changelogs to preserve a trace of changes
Let AI do the first round of review, with humans making the final judgment

The goal is not “to let AI replace people”

But rather:

Let AI enter the R&D workflow and become a stable source of productivity

Which Skills are most valuable in our team

Global-level:

YApi Skill
ZenTao Skill

Project-level:

Log Skill
KV Skill
Database Skill

The value is not in “how advanced” they are

But in:

High-frequency, stable, reusable, reviewable

A real collaboration example

When an API field changes:

Backend: modify code
Backend: use YApi Skill to do docs-sync
Frontend: read YApi changes
Frontend and backend: align integration according to the Plan

Results:

Documentation and code stay more consistent
Less repetitive communication
YApi becomes the single source of truth
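
The field-change scenario can be illustrated with a small diff that a docs-sync step might produce for the frontend to read. This sketch does not call YApi. It assumes the API definition has already been exported into a plain mapping of field name to type, and `diff_fields` is a hypothetical helper.

```python
from typing import Dict, List


def diff_fields(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two field-name -> type mappings and report what changed."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

For example, comparing `{"id": "int", "name": "string"}` with `{"id": "string", "email": "string"}` reports `email` as added, `name` as removed, and `id` as changed. That is the kind of summary both sides can align on in the Plan.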

The most pragmatic order for team adoption

First write AGENTS.md well
Turn high-frequency tasks into Skills
Connect only necessary external systems through MCP
Only then talk about automations and multi-agent

This order is more stable than “stack tools first, then patch governance later.”

8. In the end, just remember 4 sentences

Core conclusions

AI programming has moved from “code completion” into “Agentic Coding”
The key to team adoption is not model parameters, but context engineering
The default priority should be AGENTS.md > Skill > MCP
AI programming should not be evaluated only by demos, but by boundaries, quality, and real R&D metrics

References

OpenAI, "Introducing Codex"
OpenAI, "Introducing the Codex app"
OpenAI PDF, "How OpenAI uses Codex"
Anthropic, "Claude Code overview"
Anthropic, "How Claude remembers your project"
Anthropic, "Making Claude Code more secure and autonomous with sandboxing"
MCP official specification / MCP Architecture
SWE-bench official Overview

Thank you

Discussion keywords:

Agentic Coding / Skill / MCP / AGENTS.md / CLAUDE.md
