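The first thing to deal with is the problem left over from last time: adding a custom taint-tracking step.
The bundled tests already contain an example, see ql/python/ql/test/library-tests/taint/extensions/ExtensionsLib.qll
In the end it looks roughly like this: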
class AnyCallFlow extends DataFlowExtension::DataFlowNode {
  AnyCallFlow() {
    exists(CallNode call |
      call.getFunction().(AttrNode).getObject() = this
    )
  }

  override ControlFlowNode getASuccessorNode() {
    result.(CallNode).getFunction().(AttrNode).getObject() = this
  }
}
In other words: if a call has the form val.attr(...) and val is tainted, then the whole CallNode becomes tainted.
Then just add it to the Configuration:
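To make the rule concrete outside of QL, here is a minimal sketch in plain Python using the standard `ast` module. It is a toy model of the idea, not how CodeQL implements it; the helper name `call_result_is_tainted` and the `tainted_names` parameter are my own assumptions for illustration.

```python
import ast


def call_result_is_tainted(expr_src, tainted_names):
    """Toy rule: a call `val.attr(...)` is tainted when `val` is a tainted name.

    `expr_src` is a single Python expression as text, `tainted_names` is a set
    of variable names assumed to be tainted already (hypothetical input).
    """
    node = ast.parse(expr_src, mode="eval").body
    return (
        isinstance(node, ast.Call)                   # it is a call ...
        and isinstance(node.func, ast.Attribute)     # ... whose callee is `something.attr`
        and isinstance(node.func.value, ast.Name)    # ... where `something` is a plain name
        and node.func.value.id in tainted_names      # ... and that name is tainted
    )


print(call_result_is_tainted("tmp.split('|')", {"tmp"}))    # True
print(call_result_is_tainted("other.split('|')", {"tmp"}))  # False
print(call_result_is_tainted("len(tmp)", {"tmp"}))          # False, callee is not val.attr
```

The last case shows the limit of the rule: it only looks at the object a method is called on, so taint passed as an ordinary argument is left to the other flow steps.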
override predicate isExtension(TaintTracking::Extension extension) {
  extension instanceof AnyCallFlow
}
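At this point methods such as split are recognized, although this certainly raises the false-positive rate.
As an aside, I have recently been watching the software analysis course that Nanjing University put on Bilibili, and it is well taught. What we have here is really the soundness vs. completeness question. For security work it is better to lean toward soundness, so I would rather accept more false positives in exchange for fewer false negatives.
Finally, following the way the official library does it, wrap everything up. The final result: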
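The trade-off can be seen in a tiny sketch. The code below is a toy propagator of my own (the name `tainted_after` and the snippet are assumptions, and it ignores overwrites, branches and everything else a real analysis handles). It applies only the "method call on a tainted object" rule to straight-line assignments.

```python
import ast


def tainted_after(src, initially_tainted):
    """Return the names considered tainted after running the toy rule over `src`.

    Only assignments of the form `name = tainted_name.method(...)` add taint.
    Taint is never removed, which is part of the over-approximation.
    """
    tainted = set(initially_tainted)
    for stmt in ast.parse(src).body:
        if not (isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name)):
            continue
        value = stmt.value
        if (
            isinstance(value, ast.Call)
            and isinstance(value.func, ast.Attribute)
            and isinstance(value.func.value, ast.Name)
            and value.func.value.id in tainted
        ):
            tainted.add(stmt.targets[0].id)
    return tainted


SNIPPET = """
parts = tmp.split('|')
flag = tmp.isdigit()
"""

print(sorted(tainted_after(SNIPPET, {"tmp"})))  # ['flag', 'parts', 'tmp']
```

`parts` really does carry attacker-controlled data, but `flag` is only a boolean. Marking it as tainted is exactly the kind of false positive the broad rule buys in exchange for not missing `split`.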
import python
import semmle.python.security.TaintTracking
import semmle.python.web.flask.Request

// Extra flow step: taint on `val` flows to the whole call `val.attr(...)`.
class AnyCallFlow extends DataFlowExtension::DataFlowNode {
  AnyCallFlow() {
    exists(CallNode call |
      call.getFunction().(AttrNode).getObject() = this
    )
  }

  override ControlFlowNode getASuccessorNode() {
    result.(CallNode).getFunction().(AttrNode).getObject() = this
  }
}

// Functions whose first argument is dangerous when attacker-controlled.
class DangerousFunctionArg0 extends Value {
  DangerousFunctionArg0() {
    exists(Value val |
      this = val and
      (
        val = Value::named("subprocess.check_output") or
        val = Value::named("os.system") or
        val = Value::named("os.popen") or
        val = Value::named("eval") or
        val = Value::named("exec") or
        val = Value::named("flask.render_template_string")
      )
    )
  }
}

// Sink: argument 0 of a call that points to one of the functions above.
class DangerousFunctionArg0Sink extends TaintSink {
  DangerousFunctionArg0Sink() {
    exists(
      CallNode call, DangerousFunctionArg0 dangerous_func |
      call.getFunction().pointsTo(dangerous_func) and
      call.getArg(0) = this
    )
  }

  // Accept every taint kind.
  override predicate sinks(TaintKind taint) {
    any()
  }
}

class SystemCommandExecution extends TaintTracking::Configuration {
  SystemCommandExecution() { this = "SystemCommandExecution Tracking" }

  override predicate isSource(DataFlow::Node src, TaintKind kind) {
    src.asCfgNode() instanceof FlaskRequestArgs
  }

  override predicate isSink(DataFlow::Node sink, TaintKind kind) {
    sink.asCfgNode() instanceof DangerousFunctionArg0Sink
  }

  override predicate isExtension(TaintTracking::Extension extension) {
    extension instanceof AnyCallFlow
  }
}

from SystemCommandExecution config, DataFlow::Node src, DataFlow::Node sink
where config.hasSimpleFlow(src, sink)
select sink, src
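Running it against the sample below, which has 10 vulnerabilities in total, finds all of them, which is pretty good. (The sample has 12 routes; index9 and index10 pass the tainted value only as a template variable, not as the template itself, so they are not counted.)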
import flask
import subprocess
from subprocess import check_output
from flask import request

app = flask.Flask(__name__)

def passby(i):
    return i.split('123')

@app.route('/index')
def index():
    return subprocess.check_output(flask.request.args.get('c', 'ls'))

@app.route('/index2')
def index2():
    tmp = flask.request.args.get('c', 'ls')
    tmp = tmp.split('|')
    return subprocess.check_output(tmp)

@app.route('/index3')
def index3():
    tmp = flask.request.args.get('c', 'ls')
    tmp = tmp.split('|')
    return check_output(tmp)

@app.route('/index4')
def index4():
    tmp = request.args.get('c', 'ls')
    tmp = tmp.split('|')
    return subprocess.check_output(tmp)

@app.route('/index5')
def index5():
    tmp = flask.request.args.get('c', 'ls')
    tmp = tmp + "i"
    return subprocess.check_output(tmp)

@app.route('/index6')
def index6():
    tmp = request.args.get('c', 'ls')
    tmp = tmp + "i"
    return subprocess.check_output(tmp)

@app.route('/index7')
def index7():
    tmp = request.args.get('c', 'ls')
    tmp = tmp + "i"
    return check_output(tmp)

@app.route('/index8')
def index8():
    tmp = request.args.get('c', 'ls')
    tmp = tmp + "i"
    return flask.render_template_string(tmp)

@app.route('/index9')
def index9():
    tmp = request.args.get('c', 'ls')
    tmp = tmp + "i"
    return flask.render_template_string("asd", t=tmp)

@app.route('/index10')
def index10():
    tmp = request.args.get('c', 'ls')
    tmp = passby(tmp + "i")
    return flask.render_template_string("asd", t=tmp)

@app.route('/index11')
def index11():
    tmp = request.args.get('c', 'ls')
    tmp = passby(tmp + "i")
    return eval(tmp)

@app.route('/index12')
def index12():
    tmp = request.args.get('c', 'ls')
    tmp = passby(tmp + "i")
    return flask.render_template_string(tmp)

app.run()
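Finally, I feel the biggest difficulty in writing queries is the shift in mindset. A declarative language like this works like SQL: you tell the program that you want xx at some place, and that the yy inside that xx is zz.
It takes some time to change how you think. Beyond that, the official Python library itself feels a bit messy (ducks and runs): there are all sorts of similar objects, PythonFunctionCall here, CallNode there, and the same goal can be reached in ten thousand different ways. It is really not very friendly to beginners. Hopefully the documentation catches up later.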
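For what it's worth, the declarative way of thinking described above can be imitated in plain Python. The sketch below is my own illustration with the standard `ast` module and a made-up `SOURCE` string; it has nothing to do with the CodeQL API. Instead of walking the program step by step, the condition list simply states what a matching node looks like: a call, whose callee is an attribute named check_output, on a name called subprocess.

```python
import ast

# Hypothetical input, only for illustration.
SOURCE = """
import subprocess
subprocess.check_output(cmd)
print(cmd)
subprocess.run(cmd)
"""

tree = ast.parse(SOURCE)

# "I want a node that is a call, and its function is `subprocess.check_output`."
matches = [
    node.lineno
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Attribute)
    and node.func.attr == "check_output"
    and isinstance(node.func.value, ast.Name)
    and node.func.value.id == "subprocess"
]

print(matches)  # [3]
```

Each `and` line plays the role of one conjunct in a QL `where` clause, which is roughly the habit the query language asks you to build.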